public inbox for [email protected]help / color / mirror / Atom feed
pgsql: Inline CRC computation for small fixed-length input on x86 13+ messages / 5 participants [nested] [flat]
* pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 06:28 John Naylor <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: John Naylor @ 2025-03-31 06:28 UTC (permalink / raw) To: [email protected] Inline CRC computation for small fixed-length input on x86 pg_crc32c.h now has a simplified copy of the loop in pg_crc32c_sse42.c suitable for inlining where possible. This may slightly reduce contention for the WAL insertion lock, but that hasn't been tested. The motivation for this change is avoid regressing for a future commit that will use a function pointer for non-constant input in all x86 builds. While it's technically possible to make a similar change for Arm and LoongArch, there are some questions about how inlining should work since those platforms prefer stricter alignment. There are also no immediate plans to add additional implementations for them. Reviewed-by: Nathan Bossart <[email protected]> Reviewed-by: Raghuveer Devulapalli <[email protected]> Discussion: https://postgr.es/m/CANWCAZZEiTzhZcuwTiJ2=opiNpAUn1vuDRu1N02z61AthwRZLA@mail.gmail.com Discussion: https://postgr.es/m/CANWCAZYRhLHArpyfV4uRK-Rw9N5oV5HMkkKtBehcuTjNOMwCZg@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/e2809e3a1015697832ee4d37b75ba1cd0caac0f0 Modified Files -------------- src/include/port/pg_crc32c.h | 32 +++++++++++++++++++++++++++++++- 1 file changed, 31 insertions(+), 1 deletion(-) ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 07:18 John Naylor <[email protected]> parent: John Naylor <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: John Naylor @ 2025-03-31 07:18 UTC (permalink / raw) To: John Naylor <[email protected]>; +Cc: [email protected] On Mon, Mar 31, 2025 at 1:28 PM John Naylor <[email protected]> wrote: > > Inline CRC computation for small fixed-length input on x86 Hmm, skimmer doesn't like this, and it's one of the animals that builds with -msse4.2: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skimmer&dt=2025-03-31%2007%3A00%3A... "checking which CRC-32C implementation to use... SSE 4.2" I'm now looking for clues as to what could be causing the build failure. -- John Naylor Amazon Web Services ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 08:07 John Naylor <[email protected]> parent: John Naylor <[email protected]> 0 siblings, 2 replies; 13+ messages in thread From: John Naylor @ 2025-03-31 08:07 UTC (permalink / raw) To: John Naylor <[email protected]>; +Cc: [email protected]; [email protected] On Mon, Mar 31, 2025 at 2:18 PM John Naylor <[email protected]> wrote: > Hmm, skimmer doesn't like this, and it's one of the animals that > builds with -msse4.2: > > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skimmer&dt=2025-03-31%2007%3A00%3A... > > "checking which CRC-32C implementation to use... SSE 4.2" > > I'm now looking for clues as to what could be causing the build failure. Looking at the configure output, I don't see -msee4.2 (or equivalent), so it shouldn't be reporting that it's targeting SSE 4.2. Maybe it's using a cached value, and deleting the configure cache would clear this up? (CC'-d owner) -- John Naylor Amazon Web Services ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 08:34 Kyotaro Horiguchi <[email protected]> parent: John Naylor <[email protected]> 1 sibling, 1 reply; 13+ messages in thread From: Kyotaro Horiguchi @ 2025-03-31 08:34 UTC (permalink / raw) To: [email protected]; +Cc: [email protected]; [email protected]; [email protected] At Mon, 31 Mar 2025 15:07:36 +0700, John Naylor <[email protected]> wrote in > Looking at the configure output, I don't see -msee4.2 (or equivalent), > so it shouldn't be reporting that it's targeting SSE 4.2. Maybe it's > using a cached value, and deleting the configure cache would clear > this up? (CC'-d owner) I'm not sure if it's related to this, but I got the following build error. mm_crc32_u64' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32' 70 | crc = _mm_crc32_u64(crc, *(const uint64 *) p); | ^ ../../../../src/include/port/pg_crc32c.h:73:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32' 73 | crc = _mm_crc32_u32(crc, *(const uint32 *) p); | ^ ../../../../src/include/port/pg_crc32c.h:75:10: error: always_inline function '_mm_crc32_u8' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32' 75 | crc = _mm_crc32_u8(crc, *p++); | ^ 3 errors generated. > Rocky Linux release 9.5 (Blue Onyx) > gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) > clang version 18.1.8 (RESF 18.1.8-3.el9) regards. -- Kyotaro Horiguchi NTT Open Source Software Center ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 08:46 John Naylor <[email protected]> parent: Kyotaro Horiguchi <[email protected]> 0 siblings, 0 replies; 13+ messages in thread From: John Naylor @ 2025-03-31 08:46 UTC (permalink / raw) To: Kyotaro Horiguchi <[email protected]>; +Cc: [email protected]; [email protected]; [email protected] On Mon, Mar 31, 2025 at 3:34 PM Kyotaro Horiguchi <[email protected]> wrote: > I'm not sure if it's related to this, but I got the following build error. > 3 errors generated. > > > Rocky Linux release 9.5 (Blue Onyx) > > gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) > > clang version 18.1.8 (RESF 18.1.8-3.el9) Must be related. Is there anything special about this build? CFLAGS? Are you building with autoconf or meson? Could you please share the CRC section from output of configure (or meson)? -- John Naylor Amazon Web Services ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 11:16 John Naylor <[email protected]> parent: John Naylor <[email protected]> 1 sibling, 2 replies; 13+ messages in thread From: John Naylor @ 2025-03-31 11:16 UTC (permalink / raw) To: John Naylor <[email protected]>; +Cc: [email protected]; [email protected] On Mon, Mar 31, 2025 at 3:07 PM John Naylor <[email protected]> wrote: > > On Mon, Mar 31, 2025 at 2:18 PM John Naylor <[email protected]> wrote: > > > Hmm, skimmer doesn't like this, and it's one of the animals that > > builds with -msse4.2: > > > > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skimmer&dt=2025-03-31%2007%3A00%3A... > > > > "checking which CRC-32C implementation to use... SSE 4.2" > > > > I'm now looking for clues as to what could be causing the build failure. > > Looking at the configure output, I don't see -msee4.2 (or equivalent), > so it shouldn't be reporting that it's targeting SSE 4.2. Another clue: A few RHEL 9 x86_64 machines have reported in (webworm, shikra) with successful builds, and they also report targeting SSE 4.2 and also don't have special CFLAG's. We do know RHEL 9 has a policy of always targeting x86_64-v2, and it seems that they don't require user/packager intervention to achieve that, so I imagine their packaged compiler always defines __SSE4_2__ etc. The two problem systems are CentOS stream 9 (apparently using LTO), and Rocky Linux 9 (still awaiting details). Both of these are supposed to be like RHEL 9. I found these old gcc bug reports for a similar symptom when using LTO: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71991 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84926 ...so one theory is that the OS's with failing builds have let a similar bug back in. A possible workaround would be add a normally-superfluous "pg_attribute_target("sse4.2")" to the inlined function, as in the attached. -- John Naylor Amazon Web Services Attachments: [text/x-patch] v1-workaround-possible-lto-gcc-bug.patch (477B, 2-v1-workaround-possible-lto-gcc-bug.patch) download | inline diff: diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h index 0ab7513f523..b4198975fa6 100644 --- a/src/include/port/pg_crc32c.h +++ b/src/include/port/pg_crc32c.h @@ -53,6 +53,7 @@ typedef uint32 pg_crc32c; extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len); pg_attribute_no_sanitize_alignment() +pg_attribute_target("sse4.2") static inline pg_crc32c pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len) ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 12:12 Tom Lane <[email protected]> parent: John Naylor <[email protected]> 1 sibling, 2 replies; 13+ messages in thread From: Tom Lane @ 2025-03-31 12:12 UTC (permalink / raw) To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected] John Naylor <[email protected]> writes: > The two problem systems are CentOS stream 9 (apparently using LTO), > and Rocky Linux 9 (still awaiting details). Both of these are supposed > to be like RHEL 9. I have reproduced it on a genuine-Red-Hat RHEL 9 x86_64 machine, but only when compiling with --with-llvm, and the error goes away if I select CC=clang. Furthermore, configure reports checking which CRC-32C implementation to use... SSE 4.2 with CC=gcc but it says checking which CRC-32C implementation to use... SSE 4.2 with runtime check with CC=clang. Furthermore, the failure doesn't occur when gcc compiles a file, but it does occur when clang compiles the same file to produce a .bc file: [transam]$ make twophase.o gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -O2 -I../../../../src/include -D_GNU_SOURCE -c -o twophase.o twophase.c [transam]$ make twophase.bc /usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -O2 -I../../../../src/include -D_GNU_SOURCE -flto=thin -emit-llvm -c -o twophase.bc twophase.c In file included from twophase.c:79: In file included from ../../../../src/include/access/commit_ts.h:16: In file included from ../../../../src/include/replication/origin.h:15: In file included from ../../../../src/include/access/xlogreader.h:41: In file included from ../../../../src/include/access/xlogrecord.h:16: ../../../../src/include/port/pg_crc32c.h:70:10: error: always_inline function '_mm_crc32_u64' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32' 70 | crc = _mm_crc32_u64(crc, *(const uint64 *) p); | ^ ../../../../src/include/port/pg_crc32c.h:73:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32' 73 | crc = _mm_crc32_u32(crc, *(const uint32 *) p); | ^ ../../../../src/include/port/pg_crc32c.h:75:10: error: always_inline function '_mm_crc32_u8' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32' 75 | crc = _mm_crc32_u8(crc, *p++); | ^ 3 errors generated. make: *** [../../../../src/Makefile.global:1097: twophase.bc] Error 1 What I conclude is that Red Hat hot-wired gcc to assume -msse4.2, but they didn't hot-wire clang the same way. This seems kind of problematic for us. Quite aside from the build failure, doesn't it mean that the .bc files are not very representative of what is in the .o files? regards, tom lane ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 12:52 Tom Lane <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 0 replies; 13+ messages in thread From: Tom Lane @ 2025-03-31 12:52 UTC (permalink / raw) To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected] I wrote: > I have reproduced it on a genuine-Red-Hat RHEL 9 x86_64 machine, > but only when compiling with --with-llvm, and the error goes away > if I select CC=clang. Also, a build with meson doesn't fall into the same trap. I'm not sure why, because I can't find where in the build.ninja file it's trying to make any .bc files other than llvmjit_types.bc --- and that, it's using clang for. regards, tom lane ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 13:09 Tom Lane <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 2 replies; 13+ messages in thread From: Tom Lane @ 2025-03-31 13:09 UTC (permalink / raw) To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected] I wrote: > What I conclude is that Red Hat hot-wired gcc to assume -msse4.2, > but they didn't hot-wire clang the same way. In confirmation of that: everything goes through fine if I manually add -msse4.2 to configure's choice of BITCODE_CFLAGS. Not sure if that line of thought can lead to a usable solution, or if it's superior to messing with the attributes on relevant functions as you mooted upthread. regards, tom lane ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 13:33 Tom Lane <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 0 replies; 13+ messages in thread From: Tom Lane @ 2025-03-31 13:33 UTC (permalink / raw) To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected] I wrote: >> What I conclude is that Red Hat hot-wired gcc to assume -msse4.2, >> but they didn't hot-wire clang the same way. > In confirmation of that: everything goes through fine if I manually > add -msse4.2 to configure's choice of BITCODE_CFLAGS. Also, possibly useful for testing purposes: I can reproduce the build failure on RHEL8, and probably elsewhere, with ./configure CFLAGS="-O2 -msse4.2" --with-llvm In this form it's clearly pilot error, because I didn't do anything to put -msse4.2 into CXXFLAGS. But this is another way of confirming that the underlying problem is different default -m switches between gcc and clang. I'm kind of surprised we have not gotten bitten by that before. regards, tom lane ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-03-31 18:58 Todd Cook <[email protected]> parent: John Naylor <[email protected]> 1 sibling, 1 reply; 13+ messages in thread From: Todd Cook @ 2025-03-31 18:58 UTC (permalink / raw) To: John Naylor <[email protected]>; John Naylor <[email protected]>; +Cc: [email protected] <[email protected]> On 3/31/25, 7:17 AM, "John Naylor" <[email protected] <mailto:[email protected]>> wrote: > A possible workaround would be add a normally-superfluous > "pg_attribute_target("sse4.2")" to the inlined function, as in the > attached. The build succeeds on skimmer with this patch and using the configure command from the build farm status page[1]. -- todd [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skimmer&dt=2025-03-31%2018%3A00%3A14 ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-04-01 04:20 John Naylor <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 0 replies; 13+ messages in thread From: John Naylor @ 2025-04-01 04:20 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected] On Mon, Mar 31, 2025 at 8:09 PM Tom Lane <[email protected]> wrote: > > I wrote: > > What I conclude is that Red Hat hot-wired gcc to assume -msse4.2, > > but they didn't hot-wire clang the same way. > > In confirmation of that: everything goes through fine if I manually > add -msse4.2 to configure's choice of BITCODE_CFLAGS. Not sure if > that line of thought can lead to a usable solution, or if it's > superior to messing with the attributes on relevant functions as > you mooted upthread. Thanks for doing additional legwork! Since the broader issue is still up in the air, and we have confirmation that the attributes will get the buildfarm green again, I'll go do that now. -- John Naylor Amazon Web Services ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: pgsql: Inline CRC computation for small fixed-length input on x86 @ 2025-04-01 06:48 John Naylor <[email protected]> parent: Todd Cook <[email protected]> 0 siblings, 0 replies; 13+ messages in thread From: John Naylor @ 2025-04-01 06:48 UTC (permalink / raw) To: Todd Cook <[email protected]>; +Cc: John Naylor <[email protected]>; [email protected] <[email protected]> On Tue, Apr 1, 2025 at 1:58 AM Todd Cook <[email protected]> wrote: > > On 3/31/25, 7:17 AM, "John Naylor" <[email protected] <mailto:[email protected]>> wrote: > > A possible workaround would be add a normally-superfluous > > "pg_attribute_target("sse4.2")" to the inlined function, as in the > > attached. > > The build succeeds on skimmer with this patch and using the configure command from > the build farm status page[1]. I see bumblebee is now green for the first time. Thanks for testing! -- John Naylor Amazon Web Services ^ permalink raw reply [nested|flat] 13+ messages in thread
end of thread, other threads:[~2025-04-01 06:48 UTC | newest] Thread overview: 13+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2025-03-31 06:28 pgsql: Inline CRC computation for small fixed-length input on x86 John Naylor <[email protected]> 2025-03-31 07:18 ` John Naylor <[email protected]> 2025-03-31 08:07 ` John Naylor <[email protected]> 2025-03-31 08:34 ` Kyotaro Horiguchi <[email protected]> 2025-03-31 08:46 ` John Naylor <[email protected]> 2025-03-31 11:16 ` John Naylor <[email protected]> 2025-03-31 12:12 ` Tom Lane <[email protected]> 2025-03-31 12:52 ` Tom Lane <[email protected]> 2025-03-31 13:09 ` Tom Lane <[email protected]> 2025-03-31 13:33 ` Tom Lane <[email protected]> 2025-04-01 04:20 ` John Naylor <[email protected]> 2025-03-31 18:58 ` Todd Cook <[email protected]> 2025-04-01 06:48 ` John Naylor <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox