public inbox for [email protected]  
help / color / mirror / Atom feed
From: Andrew Kim <[email protected]>
To: John Naylor <[email protected]>
Cc: [email protected]
Cc: Oleg Tselebrovskiy <[email protected]>
Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
Date: Mon, 12 Jan 2026 17:58:07 -0800
Message-ID: <CAK64mnd9NE+xE18shrf-SSx-iwMVof=2DJ2y9_fOkQ5E2Abc5g@mail.gmail.com> (raw)
In-Reply-To: <CANWCAZYg2MVbYTaczNYNC2kaPodtfB8toUfE2Mhp9kut=2wzEA@mail.gmail.com>
References: <[email protected]>
	<CANWCAZYZQw-nzTXbx3Bk332VtY9_D7ksDsuMZ0A-iDZ53yG7Ng@mail.gmail.com>
	<CAK64mnfeWLBRbMfnOsag0vGTDnT84KJzpuei40nG0OHyw4SESw@mail.gmail.com>
	<CANWCAZa1b2rcvoK657SmcKwh2P2cgASQ1D-0JPj5d3LbfaAVgA@mail.gmail.com>
	<CAK64mneN20+sW5WhV+r7hMVo4Rd0z11B6=3L039rWMt1wK3nPg@mail.gmail.com>
	<CANWCAZZuS3sNgLRo8Z4AM=uY4zTmz=dH5D4Z9xV6K0CEuJ8Hdw@mail.gmail.com>
	<CAK64mnejn9AZMYz03e7HX8Uui35PihUuOy=b+iBG=YtRKx0Log@mail.gmail.com>
	<CANWCAZZ_0AQMk1HgHXHX+JaeBfy_4kzwHgTdqMptDA7zM+nm+Q@mail.gmail.com>
	<CAK64mnc6jbehHv5AHc84tVFRJg4zeMiFuvPX9xZkRpq0210MFA@mail.gmail.com>
	<CANWCAZY940P3wGOQAZWMLQL4MQGGyOu7WBjBEcn_gqcrr+NvAw@mail.gmail.com>
	<CAK64mne_oWN9d4mf+0c_5-4Emb9kRXA-OC05OJ4F_1fVqpjzDA@mail.gmail.com>
	<CANWCAZZcKYp+01u1QmkShfXVkUCCdxtJAgHT-61Vw0ALoWj47A@mail.gmail.com>
	<CAK64mne=Q_4VSpJ8f4RQB-yAThd4+i-BRYMvfdGOhvwJQdYoKQ@mail.gmail.com>
	<CANWCAZYg2MVbYTaczNYNC2kaPodtfB8toUfE2Mhp9kut=2wzEA@mail.gmail.com>

Hi John,
Thanks for taking the time to dig into this,
I really appreciate the detailed analysis, especially catching the
Meson CI failure, which I had unfortunately missed after v6.

On Sun, Jan 11, 2026 at 3:19 PM John Naylor <[email protected]> wrote:
>
> On Thu, Nov 6, 2025 at 6:50 AM Andrew Kim <[email protected]> wrote:
> > The v9 patch series is attached.
>
> Sorry for the delay. I found some issues last month and needed to
> consider the tradeoffs.
>
> First, apparently it has gone unnoticed by everyone, myself included,
> that no version has passed Meson CI since v6:
>
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5726
>
> That's because `ninja -C build -t missingdeps` gives:
>
> Missing dep: src/port/libpgport_shlib_checksum.a.p/checksum.c.o uses
> src/include/utils/errcodes.h (generated by CUSTOM_COMMAND)
> Missing dep: src/port/libpgport_checksum.a.p/checksum.c.o uses
> src/include/utils/errcodes.h (generated by CUSTOM_COMMAND)
> Processed 2561 nodes.
> Error: There are 2 missing dependency paths.
> 2 targets had depfile dependencies on 1 distinct generated inputs
> (from 1 rules)  without a non-depfile dep path to the generator.
> There might be build flakiness if any of the targets listed above are
> built alone, or not late enough, in a clean output directory.
>
> In the back of my mind I was worried of consequences of something in
> src/port depending on backend types, but hadn't seen any in my local
> builds. It seems the proximate cause is the removal of this stanza
> with no equivalent replacement:
>
> --- a/src/backend/storage/page/meson.build
> +++ b/src/backend/storage/page/meson.build
> @@ -1,14 +1,5 @@
>  # Copyright (c) 2022-2025, PostgreSQL Global Development Group
>
> -checksum_backend_lib = static_library('checksum_backend_lib',
> -  'checksum.c',
> -  dependencies: backend_build_deps,
> -  kwargs: internal_lib_args,
> -  c_args: vectorize_cflags + unroll_loops_cflags,
> -)
> -
> -backend_link_with += checksum_backend_lib
>
> The low-level algorithm doesn't care about database pages, only
> integers, so first I tried to surgically isolate the concepts, but
> that was too messy.
>
> In the attached v10-0003, I went back to something more similar to v6,
> but incorporated Andrew's idea of using PG_CHECKSUM_INTERNAL to allow
> for flexibility. Now pg_filedump compiles without any changes, so
> that's a plus.
>
> > - Provides public interfaces wrapping the basic implementation
>
> > - No code duplication (checksum.c includes checksum_impl.h)
>
> Upthread I mentioned "thin wrappers", but so far I haven't seen it in
> any patch versions, so I don't think this term means the same thing to
> you as it does to me (I saw pretty clear duplication in v9). It then
> occurred to me that with function attribute targets, doing the naive
> thing throws a compiler error IIRC -- namely just have a notional
> function call that then gets inlined and re-targeted. So in v10 I
> separated the body of checksum_block to a semi-private header to
> provide hardware-specific definitions for core code, while also
> maintaining the same one that external code expects.

I agree that the missing dependency reported by Meson is a real issue,
not just a theoretical one.
The removal of the backend-side checksum_backend_lib stanza without an
equivalent dependency path explains the CI breakage clearly,
and your diagnosis makes sense, v10-0003 approach,
splitting the body of checksum_block into a semi-private
implementation header while preserving the externally visible
interface,
that makes sense to me


>
> For this to be commitable, I think (and I think Oleg agrees) that the
> feature detection should go in src/port. Some of us have been thinking
> of refactoring and centralizing the feature detection, and now may be
> a good time to do it. Before going that far, I wanted to see what
> people think of v10.

I also agree with you (and Oleg) that feature detection really belongs
in src/port,
even if that means doing a bit more refactoring up front. As you said,
this may actually be a good forcing function to finally consolidate
feature detection in a cleaner way.

I’m supportive of using v10 as the basis for further discussion and iteration,
cleaning up Meson dependency declarations so generated headers are
properly ordered,
refining the PG_CHECKSUM_INTERNAL usage if needed,
and assisting with any additional refactoring required to keep
src/port fully backend-agnostic.
Thanks again for the careful review and for pushing this in a
direction that’s more robust and committable.

>
> --
> John Naylor
> Amazon Web Services






view thread (15+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
  In-Reply-To: <CAK64mnd9NE+xE18shrf-SSx-iwMVof=2DJ2y9_fOkQ5E2Abc5g@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox