Long time postgres developer, working at Microsoft. Account about tech, not politics. For the latter look to @AndresFreundPol
Public Key
npub1ly44p7gfxnqm237hpxc8dynusdz4jfvtqrh5nmgrwcrsxkmz5n6q6gks2j
Profile Code
nprofile1qqs0j26slyynfsd4gltsnvrkjf7gx32eyk9spm6fa5phvpcrtd32faqpzemhxue69uhhyetvv9ujuurjd9kkzmpwdejhg2kqtav
Author Public Key
npub1ly44p7gfxnqm237hpxc8dynusdz4jfvtqrh5nmgrwcrsxkmz5n6q6gks2j
Published at
2023-11-21T05:06:31+01:00
Event JSON
{
  "id": "b687d686b3150c3bf7cf6381f2c7bed5ffa459ed363d4ba9d7c302839b02b331",
  "pubkey": "f92b50f90934c1b547d709b076927c834559258b00ef49ed037607035b62a4f4",
  "created_at": 1700539591,
  "kind": 0,
  "tags": [
    [
      "proxy",
      "https://mastodon.social/users/AndresFreundTec",
      "activitypub"
    ]
  ],
  "content": "{\"name\":\"AndresFreundTec\",\"about\":\"Long time postgres developer, working at Microsoft.\\n\\nAccount about tech, not politics. For the latter look to @AndresFreundPol\",\"picture\":\"https://files.mastodon.social/accounts/avatars/109/362/110/832/715/599/original/6a9a410580be97af.jpg\",\"nip05\":\"[email protected] \"}",
  "sig": "ff5bc57290d541655dd6175b59dca680832eefa6e8376fbeeea46cfb330d449b1ca9a561f19187ce425e71905b188f6b7df580627243ab57107d180b0e362b45"
}
Last Notes

AndresFreundTec @npub1hxa…j94a They either train as PCIe 4, don't work at all, or also have AER errors. So I suspect it's a mainboard/firmware issue :/

AndresFreundTec @npub1h59…waea Thanks.

AndresFreundTec @npub1h59…waea Several DEs start themselves via systemd these days, so a graphical terminal will often have the systemd limits applied.

AndresFreundTec @npub1h59…waea (Since most things spawn from somewhere within a systemd instance these days, the PAM limits are quickly overridden by systemd.)

AndresFreundTec Here are the slides for a talk I just gave about using perf c2c to find cache line contention in postgres: https://anarazel.de/talks/2024-05-29-pgconf-dev-c2c/postgres-perf-c2c.pdf

AndresFreundTec @npub1hxa…j94a Yep. This is with turbo verified to be disabled, C-states disabled, and frequency monitored...

AndresFreundTec A potentially interesting detail: if I interpret https://community.intel.com/t5/Software-Tuning-Performance/Understanding-PCICFG-space-information/m-p/1138821#M6581 correctly, both my CPUs are 18-core models with some cores disabled. Both have the same CAPID0 (0x001881fa) and CAPID4 (0x24000e80), but different CAPID6 values (0x0001b4e3 and 0x0002c6f8). AFAICT that makes them HCC parts with 10 slices enabled.

AndresFreundTec @npub1yc6…lmgn Nope. No difference above noise. That core is the only "really slow" one.
AndresFreundTec I don't think that's it - I added userspace vs kernel cycles,instructions: https://gist.github.com/anarazel/ca7d1db68fb7380d21f6fd819a147df1 There are a few more kernel instructions/cycles in the slow case, but that's just because the slow case takes longer. If I measure for a fixed time, it's about the same.

AndresFreundTec My best theory so far is that somehow, on that one core, there are more conflicts on L1i entries, leading to a lower hit rate. I haven't figured out what the precise keying scheme for the L1i is (it's 8-way associative).

AndresFreundTec @npub142j…7ugm Yep. There are practically no interrupts, there are no SMIs, no evidence of throttling in performance counters.

AndresFreundTec @npub1e5z…g25u It's possible, but somehow it seems odd to end up with different numbers of L1i misses etc. without causing apparent corruption.

AndresFreundTec @npub1e5z…g25u The others perform like 11; 10 is slow. Note how in the paste above the instruction numbers are almost identical, but core 10 needed a lot more cycles...

AndresFreundTec @npub1e5z…g25u Core 10 is the first core of the second socket, 11 the second. The first socket does not show the same for cores 0 and 1. Nor does any other combination I've tried.

AndresFreundTec @npub17lg…9uux Hah. This is weird, but not that kind of weird, I strongly suspect. Although I wouldn't mind getting that farm.

AndresFreundTec @npub16hg…mtal It's the same NUMA node...
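The L1i-conflict theory in the notes above can be illustrated with the textbook set-index calculation. This is only a sketch under assumptions: a conventional 32 KiB, 8-way L1i with 64-byte lines (so 64 sets); the actual indexing/hashing on these CPUs may differ, and the example address is made up.

```shell
# Textbook set index for an instruction address, assuming a 32 KiB, 8-way
# L1i with 64-byte lines => 32768 / (8 * 64) = 64 sets. Under this scheme,
# two code addresses conflict when their bits 6..11 match.
addr=$(( 0x41fa40 ))                    # example address, made up
line_size=64
ways=8
sets=$(( 32768 / (ways * line_size) ))  # 64 sets
set_index=$(( (addr >> 6) % sets ))     # bits 6..11 of the address
echo "set index: $set_index"
```

With 8 ways per set, more than 8 hot cache lines mapping to the same set would evict each other, which would show up as extra L1i pressure.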
AndresFreundTec https://gist.github.com/anarazel/ca7d1db68fb7380d21f6fd819a147df1 How can two cores on the same CPU have such crazily different icache behaviour? For the same process!

AndresFreundTec Well, color me very confused. In a CPU-bound workload, two cores on the same socket have substantially different performance (a 32% slowdown). If I just migrate the running process between the cores, performance changes immediately. This is on a 2x Xeon 5215 system. I checked that it's not thermals or CPU frequency/boost, and the system is idle. Here's the odd part: the biggest difference evident in perf counters is a 2.5x difference in icache_64b.iftag_stall, with ~same icache_64b.iftag_miss.

AndresFreundTec @npub16ew…us82 I assume this one is also mounted with barrier=0?

AndresFreundTec @npub1hxa…j94a Very interesting. Thanks. A depressingly large performance diff between ext4 and btrfs, even with nocow. Interesting that dsync wins with ext4 but loses on btrfs.

AndresFreundTec @npub12yj…3l94 Thanks. What filesystem is this? These don't show the same slowdown we've seen with O_DSYNC/FUA for other Samsung SSDs. I suspect the filesystem doesn't use FUA writes...

AndresFreundTec @npub1jpa…6p97 I'm curious about your results with the SK Hynix, because they're the first non-Samsung one where FUA writes are slower. Albeit to a much lesser degree.

AndresFreundTec @npub1jpa…6p97 Thanks! Any chance you could figure out the model number of either, e.g. via smartctl -xa /dev/nvme0n1 or lsblk -o path,model,fstype,size,mountpoints?
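The core-to-core comparison described in the notes above can be sketched roughly as follows: pin the same process to each core in turn and record the icache iftag events. This only builds and prints the commands rather than running them, since they need a live pid and perf(1); the core numbers and $PGPID are placeholders.

```shell
# Construct the per-core comparison: migrate the process to one core, then
# count the icache_64b.iftag_* events mentioned above for a fixed interval.
# PGPID stands in for the pid of the benchmarked postgres backend.
events="cycles,instructions,icache_64b.iftag_stall,icache_64b.iftag_miss"
cmds=""
for core in 10 11; do
  cmds="${cmds}taskset -cp $core \$PGPID && perf stat -e $events -p \$PGPID -- sleep 10
"
done
printf '%s' "$cmds"
```

Comparing the two runs over the same fixed interval sidesteps the "slow case takes longer" skew noted earlier: with time held constant, only the per-cycle behaviour differs.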
AndresFreundTec Collecting the information here: https://gist.github.com/anarazel/b527e5317bb7d58483a9858f5f2435ca The background is that I'd like to switch to using O_DSYNC by default for postgres' WAL, but it appears some drives react unfavorably.

AndresFreundTec @npub1u8f…p9n0 Thanks!

AndresFreundTec @npub1p4g…8pte Thanks!

AndresFreundTec Going to collect the information here: https://gist.github.com/anarazel/b527e5317bb7d58483a9858f5f2435ca

AndresFreundTec Any chance a few folks could run the following fio command on various SSDs and tell me the latency, drive model and filesystem? fio --directory /srv/dev/fio/ --runtime 3 --time_based --output-format json --overwrite 1 --size=8MB --buffered 0 --bs=4096 --rw=write --name write-dsync --wait_for_previous --sync=dsync --name write-fdatasync --wait_for_previous --fdatasync=1 --name write-nondurable --wait_for_previous | jq '.jobs[] | [.jobname, .write.iops]' (--directory needs to be adjusted.)

AndresFreundTec The "new" Linux CVE approach seems ... unhelpful at best. I know from experience that dealing with security reports is a pain, and I assume that's way worse for Linux than for Postgres. But this just seems purely aimed at annoying everyone.

AndresFreundTec Congratulations to fellow postgres hackers @npub1zcv…cczq and Richard Guo for becoming committers!
https://postgr.es/m/df222085-2248-4d89-8935-256a9c384878%40postgresql.org

AndresFreundTec I did not think that finding a security vulnerability would lead to tabloids digging around in my life. Including "analyzing" (aka making things up about) the one personal-ish picture I've ever shared on social media. FFS. Oddly enough, they weren't interested in lots of kinda pretty graphs.

AndresFreundTec @npub17lg…9uux FWIW, I hadn't been on twitter in months, and this made me go back - at least earlier on there was distinct information, particularly around reverse engineering efforts.

AndresFreundTec I am a bit concerned by all the focus on small-ish projects with overwhelmed maintainers. There are indeed a lot of problems in that area. But I am certain that lots of experienced OSS devs can think of a few large and crucial projects where they fairly easily could have hidden something small in a larger change. Without a lot of prior contributions to the project.

AndresFreundTec @npub178r…qekt Even if you don't care about collateral harm, with something like a backdoor in ssh, it just seems too likely you'd otherwise accidentally make yourself vulnerable too, somewhere in your org.

AndresFreundTec I wholeheartedly agree with what Russ wrote here: "Also if there's anything the community can do for Lasse personally, please pass that along." "Anyone can be the victim of social engineering." "I suspect many of us here have had nightmares about being in Lasse's position, and probably will have more of them in the future." Indeed.
https://www.openwall.com/lists/oss-security/2024/03/30/25

AndresFreundTec I accidentally found a security issue while benchmarking postgres changes. If you run Debian testing, unstable, or some other more "bleeding edge" distribution, I strongly recommend upgrading ASAP. https://www.openwall.com/lists/oss-security/2024/03/29/4
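For interpreting the fio numbers requested in the notes above: these jobs run at queue depth 1, so the per-write latency is roughly the inverse of the reported IOPS. A sketch of the conversion; the 2500 IOPS figure is a made-up example value, not a measurement.

```shell
# Convert queue-depth-1 synchronous-write IOPS to approximate per-write
# latency: latency_us = 1e6 / IOPS. 2500 IOPS is an example value.
iops=2500
latency_us=$(awk -v iops="$iops" 'BEGIN { printf "%d", 1000000 / iops }')
echo "~${latency_us} us per 4k write"
```

A large gap between the write-dsync and write-fdatasync jobs on the same drive is the FUA-related slowdown being collected in the gist.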