public inbox for [email protected]  
From: John Naylor <[email protected]>
To: Ants Aasma <[email protected]>
Cc: Andrew Kim <[email protected]>
Cc: Oleg Tselebrovskiy <[email protected]>
Cc: [email protected]
Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
Date: Tue, 31 Mar 2026 11:09:26 +0700
Message-ID: <CANWCAZYrjnCCE6m=5oRs+Ok=sgMrdf33xM25Fxy3yp=kQAoNwA@mail.gmail.com> (raw)
In-Reply-To: <CANwKhkMN31RoNab8ovJjZaW=o6CNHCu-rznk85wKO=L5z5-PSA@mail.gmail.com>
References: <[email protected]>
	<CANWCAZYZQw-nzTXbx3Bk332VtY9_D7ksDsuMZ0A-iDZ53yG7Ng@mail.gmail.com>
	<CAK64mnfeWLBRbMfnOsag0vGTDnT84KJzpuei40nG0OHyw4SESw@mail.gmail.com>
	<CANWCAZa1b2rcvoK657SmcKwh2P2cgASQ1D-0JPj5d3LbfaAVgA@mail.gmail.com>
	<CAK64mneN20+sW5WhV+r7hMVo4Rd0z11B6=3L039rWMt1wK3nPg@mail.gmail.com>
	<CANWCAZZuS3sNgLRo8Z4AM=uY4zTmz=dH5D4Z9xV6K0CEuJ8Hdw@mail.gmail.com>
	<CAK64mnejn9AZMYz03e7HX8Uui35PihUuOy=b+iBG=YtRKx0Log@mail.gmail.com>
	<CANWCAZZ_0AQMk1HgHXHX+JaeBfy_4kzwHgTdqMptDA7zM+nm+Q@mail.gmail.com>
	<CAK64mnc6jbehHv5AHc84tVFRJg4zeMiFuvPX9xZkRpq0210MFA@mail.gmail.com>
	<CANWCAZY940P3wGOQAZWMLQL4MQGGyOu7WBjBEcn_gqcrr+NvAw@mail.gmail.com>
	<CAK64mne_oWN9d4mf+0c_5-4Emb9kRXA-OC05OJ4F_1fVqpjzDA@mail.gmail.com>
	<CANWCAZZcKYp+01u1QmkShfXVkUCCdxtJAgHT-61Vw0ALoWj47A@mail.gmail.com>
	<CAK64mne=Q_4VSpJ8f4RQB-yAThd4+i-BRYMvfdGOhvwJQdYoKQ@mail.gmail.com>
	<CANWCAZYg2MVbYTaczNYNC2kaPodtfB8toUfE2Mhp9kut=2wzEA@mail.gmail.com>
	<CAK64mnd9NE+xE18shrf-SSx-iwMVof=2DJ2y9_fOkQ5E2Abc5g@mail.gmail.com>
	<CANWCAZbjdFnBiUmrBQC5vFFy0Fnn4SJG4AkkzGpTFhovodJdYQ@mail.gmail.com>
	<[email protected]>
	<CANWCAZZJ1tQcwWZe4BTgv1E-+bvhe4d0LzJvXeZCFMjRtWpk-w@mail.gmail.com>
	<CAK64mnfwyr-6GMRFFW_3a+xXpJxpYymCOygfbr-HUA_7+tQk2Q@mail.gmail.com>
	<CAK64mndS2Oy1i9ehEALwyv5EBpjMozTxSkt8HYc17a+MKDvmdQ@mail.gmail.com>
	<CANWCAZbigzGc_Kzsqf3NB+FgfnvJCas_KovCXg3GROJTVjuS9Q@mail.gmail.com>
	<CANWCAZZ49dJ7XR1dY==7cHs93H7huo9f6RA_2qevFLp9eaOk4g@mail.gmail.com>
	<CANwKhkMN31RoNab8ovJjZaW=o6CNHCu-rznk85wKO=L5z5-PSA@mail.gmail.com>

On Mon, Mar 30, 2026 at 10:01 PM Ants Aasma <[email protected]> wrote:
>
> On Mon, 30 Mar 2026 at 15:01, John Naylor <[email protected]> wrote:
> > I don't remember the last time anyone did measurements, so I went
> > ahead and did that:
> >
> > master: 945ms
> > 32 AVX2: 335ms
> > 64 AVX2: 220ms
>
> I'm guessing this is on a recent Intel. Any extra width is helpful on
> Intel as they doubled vpmulld latency from under us after we had
> settled on this algorithm.

It's actually ancient and due to be replaced soon, but still several
years after the adoption of this algorithm.

> FWIW I think AVX2 (x86-64-v3) is fine.

Glad to hear it, although the patch doesn't use that build flag, so
it's not impossible there is some additional difference in the
compiler's model. Still, given the variation you found, I'll make sure
the commit message says "several times faster" so it's not specific to
my hardware.
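
For readers following along, here is a minimal sketch of the difference
between a global build flag such as -march=x86-64-v3 and targeting a
single function. It is not the patch under discussion, and whether the
patch works this way is an assumption. The names N_LANES, MIX_PRIME,
mix, checksum_body, checksum_default, checksum_avx2 and
checksum_dispatch are hypothetical, and the seeds and mixing constants
are placeholders. It relies on the GCC/Clang target attribute and
__builtin_cpu_supports, and falls back to the plain build elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

#define N_LANES 32              /* hypothetical lane count */
#define MIX_PRIME 16777619u     /* FNV prime, used here for illustration */

/* One mixing step: XOR in the input word, multiply, fold high bits down. */
static inline uint32_t
mix(uint32_t sum, uint32_t value)
{
    uint32_t tmp = sum ^ value;

    return (tmp * MIX_PRIME) ^ (tmp >> 17);
}

/*
 * Shared loop body. The inner loop over independent lanes is the part a
 * compiler can auto-vectorize. Trailing words that do not fill a whole
 * row of N_LANES are ignored in this sketch.
 */
static inline uint32_t
checksum_body(const uint32_t *data, size_t nwords)
{
    uint32_t sums[N_LANES];
    uint32_t result = 0;

    for (int j = 0; j < N_LANES; j++)
        sums[j] = (uint32_t) j + 1;     /* placeholder seeds */

    for (size_t i = 0; i + N_LANES <= nwords; i += N_LANES)
        for (int j = 0; j < N_LANES; j++)
            sums[j] = mix(sums[j], data[i + j]);

    for (int j = 0; j < N_LANES; j++)
        result ^= sums[j];

    return result;
}

/* Built with whatever instruction set the global build flags allow. */
static uint32_t
checksum_default(const uint32_t *data, size_t nwords)
{
    return checksum_body(data, nwords);
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
/* Same source, but the compiler may use AVX2 inside this one function. */
__attribute__((target("avx2")))
static uint32_t
checksum_avx2(const uint32_t *data, size_t nwords)
{
    return checksum_body(data, nwords);
}
#endif

/* Pick an implementation at run time based on what the CPU reports. */
static uint32_t
checksum_dispatch(const uint32_t *data, size_t nwords)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    if (__builtin_cpu_supports("avx2"))
        return checksum_avx2(data, nwords);
#endif
    return checksum_default(data, nwords);
}
```

Both paths compute the same value; only the instructions the compiler
is allowed to emit differ. Since a per-function attribute does not
necessarily set the same tuning defaults as a global -march level, the
generated code could differ between the two approaches.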

--
John Naylor
Amazon Web Services




