public inbox for [email protected]  
help / color / mirror / Atom feed
pgsql: Inline CRC computation for small fixed-length input on x86
13+ messages / 5 participants
[nested] [flat]

* pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 06:28  John Naylor <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: John Naylor @ 2025-03-31 06:28 UTC (permalink / raw)
  To: [email protected]

Inline CRC computation for small fixed-length input on x86

pg_crc32c.h now has a simplified copy of the loop in pg_crc32c_sse42.c
suitable for inlining where possible.

This may slightly reduce contention for the WAL insertion lock,
but that hasn't been tested. The motivation for this change is avoid
regressing for a future commit that will use a function pointer for
non-constant input in all x86 builds.

While it's technically possible to make a similar change for Arm and
LoongArch, there are some questions about how inlining should work
since those platforms prefer stricter alignment. There are also no
immediate plans to add additional implementations for them.

Reviewed-by: Nathan Bossart <[email protected]>
Reviewed-by: Raghuveer Devulapalli <[email protected]>
Discussion: https://postgr.es/m/CANWCAZZEiTzhZcuwTiJ2=opiNpAUn1vuDRu1N02z61AthwRZLA@mail.gmail.com
Discussion: https://postgr.es/m/CANWCAZYRhLHArpyfV4uRK-Rw9N5oV5HMkkKtBehcuTjNOMwCZg@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/e2809e3a1015697832ee4d37b75ba1cd0caac0f0

Modified Files
--------------
src/include/port/pg_crc32c.h | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)



^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 07:18  John Naylor <[email protected]>
  parent: John Naylor <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: John Naylor @ 2025-03-31 07:18 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: [email protected]

On Mon, Mar 31, 2025 at 1:28 PM John Naylor <[email protected]> wrote:
>
> Inline CRC computation for small fixed-length input on x86

Hmm, skimmer doesn't like this, and it's one of the animals that
builds with -msse4.2:

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skimmer&dt=2025-03-31%2007%3A00%3A...

"checking which CRC-32C implementation to use... SSE 4.2"

 I'm now looking for clues as to what could be causing the build failure.

-- 
John Naylor
Amazon Web Services





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 08:07  John Naylor <[email protected]>
  parent: John Naylor <[email protected]>
  0 siblings, 2 replies; 13+ messages in thread

From: John Naylor @ 2025-03-31 08:07 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: [email protected]; [email protected]

On Mon, Mar 31, 2025 at 2:18 PM John Naylor <[email protected]> wrote:

> Hmm, skimmer doesn't like this, and it's one of the animals that
> builds with -msse4.2:
>
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skimmer&dt=2025-03-31%2007%3A00%3A...
>
> "checking which CRC-32C implementation to use... SSE 4.2"
>
>  I'm now looking for clues as to what could be causing the build failure.

Looking at the configure output, I don't see -msee4.2 (or equivalent),
so it shouldn't be reporting that it's targeting SSE 4.2. Maybe it's
using a cached value, and deleting the configure cache would clear
this up? (CC'-d owner)

-- 
John Naylor
Amazon Web Services





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 08:34  Kyotaro Horiguchi <[email protected]>
  parent: John Naylor <[email protected]>
  1 sibling, 1 reply; 13+ messages in thread

From: Kyotaro Horiguchi @ 2025-03-31 08:34 UTC (permalink / raw)
  To: [email protected]; +Cc: [email protected]; [email protected]; [email protected]

At Mon, 31 Mar 2025 15:07:36 +0700, John Naylor <[email protected]> wrote in 
> Looking at the configure output, I don't see -msee4.2 (or equivalent),
> so it shouldn't be reporting that it's targeting SSE 4.2. Maybe it's
> using a cached value, and deleting the configure cache would clear
> this up? (CC'-d owner)

I'm not sure if it's related to this, but I got the following build error.

mm_crc32_u64' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32'
   70 |                         crc = _mm_crc32_u64(crc, *(const uint64 *) p);
      |                               ^
../../../../src/include/port/pg_crc32c.h:73:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32'
   73 |                         crc = _mm_crc32_u32(crc, *(const uint32 *) p);
      |                               ^
../../../../src/include/port/pg_crc32c.h:75:10: error: always_inline function '_mm_crc32_u8' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32'
   75 |                         crc = _mm_crc32_u8(crc, *p++);
      |                               ^
3 errors generated.

> Rocky Linux release 9.5 (Blue Onyx)
> gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5)
> clang version 18.1.8 (RESF 18.1.8-3.el9)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 08:46  John Naylor <[email protected]>
  parent: Kyotaro Horiguchi <[email protected]>
  0 siblings, 0 replies; 13+ messages in thread

From: John Naylor @ 2025-03-31 08:46 UTC (permalink / raw)
  To: Kyotaro Horiguchi <[email protected]>; +Cc: [email protected]; [email protected]; [email protected]

On Mon, Mar 31, 2025 at 3:34 PM Kyotaro Horiguchi
<[email protected]> wrote:
> I'm not sure if it's related to this, but I got the following build error.

> 3 errors generated.
>
> > Rocky Linux release 9.5 (Blue Onyx)
> > gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5)
> > clang version 18.1.8 (RESF 18.1.8-3.el9)

Must be related.
Is there anything special about this build? CFLAGS?
Are you building with autoconf or meson?
Could you please share the CRC section from output of configure (or meson)?

-- 
John Naylor
Amazon Web Services





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 11:16  John Naylor <[email protected]>
  parent: John Naylor <[email protected]>
  1 sibling, 2 replies; 13+ messages in thread

From: John Naylor @ 2025-03-31 11:16 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: [email protected]; [email protected]

On Mon, Mar 31, 2025 at 3:07 PM John Naylor <[email protected]> wrote:
>
> On Mon, Mar 31, 2025 at 2:18 PM John Naylor <[email protected]> wrote:
>
> > Hmm, skimmer doesn't like this, and it's one of the animals that
> > builds with -msse4.2:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skimmer&dt=2025-03-31%2007%3A00%3A...
> >
> > "checking which CRC-32C implementation to use... SSE 4.2"
> >
> >  I'm now looking for clues as to what could be causing the build failure.
>
> Looking at the configure output, I don't see -msee4.2 (or equivalent),
> so it shouldn't be reporting that it's targeting SSE 4.2.

Another clue: A few RHEL 9 x86_64 machines have reported in (webworm,
shikra) with successful builds, and they also report targeting SSE 4.2
and also don't have special CFLAG's. We do know RHEL 9 has a policy of
always targeting x86_64-v2, and it seems that they don't require
user/packager intervention to achieve that, so I imagine their
packaged compiler always defines __SSE4_2__ etc.

The two problem systems are CentOS stream 9 (apparently using LTO),
and Rocky Linux 9 (still awaiting details). Both of these are supposed
to be like RHEL 9.

I found these old gcc bug reports for a similar symptom when using LTO:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71991
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84926

...so one theory is that the OS's with failing builds have let a
similar bug back in.

A possible workaround would be add a normally-superfluous
"pg_attribute_target("sse4.2")" to the inlined function, as in the
attached.

--
John Naylor
Amazon Web Services


Attachments:

  [text/x-patch] v1-workaround-possible-lto-gcc-bug.patch (477B, 2-v1-workaround-possible-lto-gcc-bug.patch)
  download | inline diff:
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 0ab7513f523..b4198975fa6 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -53,6 +53,7 @@ typedef uint32 pg_crc32c;
 extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
 
 pg_attribute_no_sanitize_alignment()
+pg_attribute_target("sse4.2")
 static inline
 pg_crc32c
 pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)


^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 12:12  Tom Lane <[email protected]>
  parent: John Naylor <[email protected]>
  1 sibling, 2 replies; 13+ messages in thread

From: Tom Lane @ 2025-03-31 12:12 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected]

John Naylor <[email protected]> writes:
> The two problem systems are CentOS stream 9 (apparently using LTO),
> and Rocky Linux 9 (still awaiting details). Both of these are supposed
> to be like RHEL 9.

I have reproduced it on a genuine-Red-Hat RHEL 9 x86_64 machine,
but only when compiling with --with-llvm, and the error goes away
if I select CC=clang.  Furthermore, configure reports

checking which CRC-32C implementation to use... SSE 4.2

with CC=gcc but it says

checking which CRC-32C implementation to use... SSE 4.2 with runtime check

with CC=clang.  Furthermore, the failure doesn't occur when gcc
compiles a file, but it does occur when clang compiles the same
file to produce a .bc file:

[transam]$ make twophase.o
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -O2 -I../../../../src/include   -D_GNU_SOURCE   -c -o twophase.o twophase.c
[transam]$ make twophase.bc
/usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -O2  -I../../../../src/include   -D_GNU_SOURCE  -flto=thin -emit-llvm -c -o twophase.bc twophase.c
In file included from twophase.c:79:
In file included from ../../../../src/include/access/commit_ts.h:16:
In file included from ../../../../src/include/replication/origin.h:15:
In file included from ../../../../src/include/access/xlogreader.h:41:
In file included from ../../../../src/include/access/xlogrecord.h:16:
../../../../src/include/port/pg_crc32c.h:70:10: error: always_inline function '_mm_crc32_u64' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32'
   70 |                         crc = _mm_crc32_u64(crc, *(const uint64 *) p);
      |                               ^
../../../../src/include/port/pg_crc32c.h:73:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32'
   73 |                         crc = _mm_crc32_u32(crc, *(const uint32 *) p);
      |                               ^
../../../../src/include/port/pg_crc32c.h:75:10: error: always_inline function '_mm_crc32_u8' requires target feature 'crc32', but would be inlined into function 'pg_comp_crc32c_dispatch' that is compiled without support for 'crc32'
   75 |                         crc = _mm_crc32_u8(crc, *p++);
      |                               ^
3 errors generated.
make: *** [../../../../src/Makefile.global:1097: twophase.bc] Error 1

What I conclude is that Red Hat hot-wired gcc to assume -msse4.2,
but they didn't hot-wire clang the same way.

This seems kind of problematic for us.  Quite aside from the build
failure, doesn't it mean that the .bc files are not very
representative of what is in the .o files?

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 12:52  Tom Lane <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: Tom Lane @ 2025-03-31 12:52 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected]

I wrote:
> I have reproduced it on a genuine-Red-Hat RHEL 9 x86_64 machine,
> but only when compiling with --with-llvm, and the error goes away
> if I select CC=clang.

Also, a build with meson doesn't fall into the same trap.  I'm not
sure why, because I can't find where in the build.ninja file it's
trying to make any .bc files other than llvmjit_types.bc --- and
that, it's using clang for.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 13:09  Tom Lane <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 2 replies; 13+ messages in thread

From: Tom Lane @ 2025-03-31 13:09 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected]

I wrote:
> What I conclude is that Red Hat hot-wired gcc to assume -msse4.2,
> but they didn't hot-wire clang the same way.

In confirmation of that: everything goes through fine if I manually
add -msse4.2 to configure's choice of BITCODE_CFLAGS.  Not sure if
that line of thought can lead to a usable solution, or if it's
superior to messing with the attributes on relevant functions as
you mooted upthread.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 13:33  Tom Lane <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: Tom Lane @ 2025-03-31 13:33 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected]

I wrote:
>> What I conclude is that Red Hat hot-wired gcc to assume -msse4.2,
>> but they didn't hot-wire clang the same way.

> In confirmation of that: everything goes through fine if I manually
> add -msse4.2 to configure's choice of BITCODE_CFLAGS.

Also, possibly useful for testing purposes: I can reproduce the
build failure on RHEL8, and probably elsewhere, with

./configure CFLAGS="-O2 -msse4.2" --with-llvm

In this form it's clearly pilot error, because I didn't do anything
to put -msse4.2 into CXXFLAGS.  But this is another way of confirming
that the underlying problem is different default -m switches between
gcc and clang.  I'm kind of surprised we have not gotten bitten by
that before.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-03-31 18:58  Todd Cook <[email protected]>
  parent: John Naylor <[email protected]>
  1 sibling, 1 reply; 13+ messages in thread

From: Todd Cook @ 2025-03-31 18:58 UTC (permalink / raw)
  To: John Naylor <[email protected]>; John Naylor <[email protected]>; +Cc: [email protected] <[email protected]>

On 3/31/25, 7:17 AM, "John Naylor" <[email protected] <mailto:[email protected]>> wrote:
> A possible workaround would be add a normally-superfluous
> "pg_attribute_target("sse4.2")" to the inlined function, as in the
> attached.

The build succeeds on skimmer with this patch and using the configure command from
the build farm status page[1].

-- todd

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skimmer&dt=2025-03-31%2018%3A00%3A14




^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-04-01 04:20  John Naylor <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: John Naylor @ 2025-04-01 04:20 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Andres Freund <[email protected]>; [email protected]; [email protected]

On Mon, Mar 31, 2025 at 8:09 PM Tom Lane <[email protected]> wrote:
>
> I wrote:
> > What I conclude is that Red Hat hot-wired gcc to assume -msse4.2,
> > but they didn't hot-wire clang the same way.
>
> In confirmation of that: everything goes through fine if I manually
> add -msse4.2 to configure's choice of BITCODE_CFLAGS.  Not sure if
> that line of thought can lead to a usable solution, or if it's
> superior to messing with the attributes on relevant functions as
> you mooted upthread.

Thanks for doing additional legwork!

Since the broader issue is still up in the air, and we have
confirmation that the attributes will get the buildfarm green again,
I'll go do that now.

-- 
John Naylor
Amazon Web Services





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: pgsql: Inline CRC computation for small fixed-length input on x86
@ 2025-04-01 06:48  John Naylor <[email protected]>
  parent: Todd Cook <[email protected]>
  0 siblings, 0 replies; 13+ messages in thread

From: John Naylor @ 2025-04-01 06:48 UTC (permalink / raw)
  To: Todd Cook <[email protected]>; +Cc: John Naylor <[email protected]>; [email protected] <[email protected]>

On Tue, Apr 1, 2025 at 1:58 AM Todd Cook <[email protected]> wrote:
>
> On 3/31/25, 7:17 AM, "John Naylor" <[email protected] <mailto:[email protected]>> wrote:
> > A possible workaround would be add a normally-superfluous
> > "pg_attribute_target("sse4.2")" to the inlined function, as in the
> > attached.
>
> The build succeeds on skimmer with this patch and using the configure command from
> the build farm status page[1].

I see bumblebee is now green for the first time. Thanks for testing!

-- 
John Naylor
Amazon Web Services






^ permalink  raw  reply  [nested|flat] 13+ messages in thread


end of thread, other threads:[~2025-04-01 06:48 UTC | newest]

Thread overview: 13+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-03-31 06:28 pgsql: Inline CRC computation for small fixed-length input on x86 John Naylor <[email protected]>
2025-03-31 07:18 ` John Naylor <[email protected]>
2025-03-31 08:07   ` John Naylor <[email protected]>
2025-03-31 08:34     ` Kyotaro Horiguchi <[email protected]>
2025-03-31 08:46       ` John Naylor <[email protected]>
2025-03-31 11:16     ` John Naylor <[email protected]>
2025-03-31 12:12       ` Tom Lane <[email protected]>
2025-03-31 12:52         ` Tom Lane <[email protected]>
2025-03-31 13:09         ` Tom Lane <[email protected]>
2025-03-31 13:33           ` Tom Lane <[email protected]>
2025-04-01 04:20           ` John Naylor <[email protected]>
2025-03-31 18:58       ` Todd Cook <[email protected]>
2025-04-01 06:48         ` John Naylor <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox