public inbox for [email protected]
help / color / mirror / Atom feedFrom: Andrew Kim <[email protected]>
To: John Naylor <[email protected]>
Cc: [email protected]
Cc: Oleg Tselebrovskiy <[email protected]>
Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
Date: Mon, 17 Nov 2025 19:32:01 -0800
Message-ID: <CAK64mne_oWN9d4mf+0c_5-4Emb9kRXA-OC05OJ4F_1fVqpjzDA@mail.gmail.com> (raw)
In-Reply-To: <CANWCAZY940P3wGOQAZWMLQL4MQGGyOu7WBjBEcn_gqcrr+NvAw@mail.gmail.com>
References: <[email protected]>
<CANWCAZYZQw-nzTXbx3Bk332VtY9_D7ksDsuMZ0A-iDZ53yG7Ng@mail.gmail.com>
<CAK64mnfeWLBRbMfnOsag0vGTDnT84KJzpuei40nG0OHyw4SESw@mail.gmail.com>
<CANWCAZa1b2rcvoK657SmcKwh2P2cgASQ1D-0JPj5d3LbfaAVgA@mail.gmail.com>
<CAK64mneN20+sW5WhV+r7hMVo4Rd0z11B6=3L039rWMt1wK3nPg@mail.gmail.com>
<CANWCAZZuS3sNgLRo8Z4AM=uY4zTmz=dH5D4Z9xV6K0CEuJ8Hdw@mail.gmail.com>
<CAK64mnejn9AZMYz03e7HX8Uui35PihUuOy=b+iBG=YtRKx0Log@mail.gmail.com>
<CANWCAZZ_0AQMk1HgHXHX+JaeBfy_4kzwHgTdqMptDA7zM+nm+Q@mail.gmail.com>
<CAK64mnc6jbehHv5AHc84tVFRJg4zeMiFuvPX9xZkRpq0210MFA@mail.gmail.com>
<CANWCAZY940P3wGOQAZWMLQL4MQGGyOu7WBjBEcn_gqcrr+NvAw@mail.gmail.com>
Hi John,
Thanks for taking the time to review the v9-0001 refactoring patch and
for setting the CF entry to Needs Review.
On Fri, Nov 14, 2025 at 3:34 AM John Naylor <[email protected]> wrote:
>
> On Thu, Nov 6, 2025 at 6:50 AM Andrew Kim <[email protected]> wrote:
> > The v9 patch series is attached.
>
> Thanks! BTW, I've set the commitfest entry to "Needs Review".
>
> I spent some time with the v9-0001 refactoring patch, and I have one
> observation that's worth sharing now:
>
> --- a/src/bin/pg_checksums/pg_checksums.c
> +++ b/src/bin/pg_checksums/pg_checksums.c
> @@ -29,8 +29,7 @@
> #include "getopt_long.h"
> #include "pg_getopt.h"
> #include "storage/bufpage.h"
> -#include "storage/checksum.h"
> -#include "storage/checksum_impl.h"
> +#include "port/checksum.h"
>
> Outside the backend it is no longer required to include the
> implementation header -- it just links and works correctly. That seems
> like a good thing. In fact...
>
> On Tue, Oct 21, 2025 I wrote:
> > Looking at commit f04216341dd1, we have at least one example of an
> > external program, pg_filedump. If we can keep this working with
> > minimal fuss, it should be fine everywhere.
>
> > https://github.com/df7cb/pg_filedump/blob/master/pg_filedump.c#L29
> >
> > ```
> > /* checksum_impl.h uses Assert, which doesn't work outside the server */
> > #undef Assert
> > #define Assert(X)
> >
> > #include "storage/checksum.h"
> > #include "storage/checksum_impl.h"
> > ```
>
> I tried building pg_filedump against the server with just the 0001
> patch and it also builds fine leaving out the implementation:
>
> diff --git a/pg_filedump.c b/pg_filedump.c
> index 606a85b..0268381 100644
> --- a/pg_filedump.c
> +++ b/pg_filedump.c
> @@ -26,12 +26,7 @@
>
> #include <utils/pg_crc.h>
>
> -/* checksum_impl.h uses Assert, which doesn't work outside the server */
> -#undef Assert
> -#define Assert(X)
> -
> -#include "storage/checksum.h"
> -#include "storage/checksum_impl.h"
> +#include "port/checksum.h"
> #include "decode.h"
> #include <inttypes.h>
>
> I verified that it does in fact break when built against our master
> branch without the impl.h header.
>
> Further, if I replace the above CRC #include to point instead to our
> hardware-specific API in port/pg_crc32c.h, it builds fine after
> adjusting the typedef that it expects, and interprets the control file
> normally:
>
> <pg_control Contents> *********************************************
>
> CRC: Correct
> pg_control Version: 1800
> Catalog Version: 202511051
> ...
>
> I'll think more about this.
>
I've double-checked everything after applying the v9 checksum patches
and updating pg_filedump accordingly.
Following your suggestion, I removed the checksum_impl.h include and
the Assert redefinition, keeping only the port/checksum.h include.
build compiles cleanly with the new architecture, and pg_filedump
functions correctly with the AVX2 optimizations.
If you agree with this approach, I'd like to prepare a patch for
upstream submission.
Please feel free to guide me on the proper process for this. Should I
submit it to the pg_filedump repository, or would you prefer to handle
the integration as part of the v9 checksum patch series?
Thanks again for your testing and guidance!
> --
> John Naylor
> Amazon Web Services
view thread (35+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
In-Reply-To: <CAK64mne_oWN9d4mf+0c_5-4Emb9kRXA-OC05OJ4F_1fVqpjzDA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox