From: Manni Wood <[email protected]>
To: Nazir Bilal Yavuz <[email protected]>
Cc: KAZAR Ayoub <[email protected]>
Cc: Nathan Bossart <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: Sun, 8 Mar 2026 14:45:44 -0500
Message-ID: <CAKWEB6pJ-5b7QUmVtG12hC0bQ82OvDv4XsidAcnngN36q28qTQ@mail.gmail.com> (raw)
In-Reply-To: <CAN55FZ1k2sOZn1qkDDVTX8Z1t99ZUkywBhdp_43euDHaAi5bRA@mail.gmail.com>
References: <aZ3kYQnF9_u6sUQp@nathan>
	<CAN55FZ3+NYF1TkKyNtpRQuLiaauSYk9G5tA+fpruOA4-14Y_ZA@mail.gmail.com>
	<aaXrGSyq4u2d9qEC@nathan>
	<CAN55FZ2DNaKCK3Kf_kHizb2pAbQvULeDYtzaiz97B_xz7YbrkQ@mail.gmail.com>
	<[email protected]>
	<CAKWEB6r0CrN-a2P=2ey3EK7p1MxsbQx2C8=hpNGfxLxnRaX66Q@mail.gmail.com>
	<CAN55FZ1BYbdt97RLkd-vAVZ5EKiX=KLkhJ_S-4vgvNJRHvQZ6w@mail.gmail.com>
	<CAKWEB6pmXvv3sbQBNLAR_B=8wzX8rn4VFsJ4WwmNbuJ2etSwDQ@mail.gmail.com>
	<CAN55FZ25HV0FNziFR8xBpQpHkNPS48w8yEXCc91W+=0c0jr+KA@mail.gmail.com>
	<CAKWEB6rnMKYGSt=t9=pL2kKUefsAdzKfjtwdqW_acv1+vMTKVA@mail.gmail.com>
	<aatfiTMLvLok2cUc@nathan>
	<CA+K2RunT=1P7_E4jrZMogDiPunNCOGw9P_UHLaJRJk=5_odKmA@mail.gmail.com>
	<CAN55FZ1k2sOZn1qkDDVTX8Z1t99ZUkywBhdp_43euDHaAi5bRA@mail.gmail.com>

On Sun, Mar 8, 2026 at 5:31 AM Nazir Bilal Yavuz <[email protected]> wrote:

> Hi,
>
> On Sat, 7 Mar 2026 at 02:31, KAZAR Ayoub <[email protected]> wrote:
> >
> > On Sat, Mar 7, 2026 at 12:13 AM Nathan Bossart <[email protected]> wrote:
> >>
> >> On Fri, Mar 06, 2026 at 03:25:46PM -0600, Manni Wood wrote:
> >> > Well, golly! Look at these numbers. Old master with no lz4, your v11
> >> > patch with no lz4, and then your v11 patch with lz4 compiled in.
> >>
> >> I'm appreciative of all the benchmarking that you and others are doing,
> >> but wouldn't we be more interested in the difference between "old master
> >> with lz4" and "v11 with lz4"?  Else, we have multiple variables in play.
> >
> > Yes I agree because the lz4 effect doesn't prove anything for the SIMD
> > patch itself right ? So basically a comparison for the SIMD effect should
> > be "master with/out lz4 vs patched with/out lz4, respectively and nothing
> > more!", is this correct ?
>
> Yes, I think 'master with/out lz4 vs patched with/out lz4,
> respectively' is enough to determine the effect of the SIMD patch.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft
>

Hello!

As requested, here are some numbers based on the latest master but with the
copy code inlining excised (`git revert
dc592a41557b072178f1798700bf9c69cd8e4235`), compared to master with copy
code inlining left in place and the v11 patch applied.
Both results have lz4 compression in place.

I have not run numbers without lz4. I assume I could use the two postgres
instances that I have compiled with lz4, but just set
`default_toast_compression = pglz` in postgresql.conf for both instances.
Let me know if that is a mistaken assumption on my part.

arm NARROW master without inline with lz4
TXT :                 10362.799500 ms
CSV :                 10288.791000 ms
TXT with 1/3 escapes: 10411.416250 ms
CSV with 1/3 quotes:  12318.385750 ms

arm NARROW master with inline with lz4 with v11patch
TXT :                 10317.125750 ms  0.440747% improvement
CSV :                 10418.020250 ms -1.256020% regression
TXT with 1/3 escapes: 10188.319500 ms  2.142809% improvement
CSV with 1/3 quotes:  12032.964500 ms  2.317035% improvement


arm WIDE master without inline with lz4
TXT :                  5608.834500 ms
CSV :                  8115.155000 ms
TXT with 1/3 escapes:  7037.290500 ms
CSV with 1/3 quotes:  10894.615750 ms

arm WIDE master with inline with lz4 with v11patch
TXT :                  3190.268750 ms  43.120647% improvement
CSV :                  3135.177000 ms  61.366394% improvement
TXT with 1/3 escapes:  6373.746750 ms   9.428966% improvement
CSV with 1/3 quotes:  10336.763500 ms   5.120440% improvement



x86 NARROW-master-without-inline-with-lz4.log
TXT :                 26701.079250 ms
CSV :                 26492.235500 ms
TXT with 1/3 escapes: 28590.508250 ms
CSV with 1/3 quotes:  34876.742750 ms

x86 NARROW-master-with-inline-with-lz4-with-v11patch.log
TXT :                 26511.747750 ms  0.709078% improvement
CSV :                 26261.269750 ms  0.871824% improvement
TXT with 1/3 escapes: 27702.964750 ms  3.104329% improvement
CSV with 1/3 quotes:  32339.393000 ms  7.275191% improvement


x86 WIDE-master-without-inline-with-lz4.log
TXT :                 14485.563250 ms
CSV :                 21392.582000 ms
TXT with 1/3 escapes: 18081.514750 ms
CSV with 1/3 quotes:  32547.086250 ms

x86 WIDE-master-with-inline-with-lz4-with-v11patch.log
TXT :                  8080.378250 ms  44.217714% improvement
CSV :                  8283.723000 ms  61.277591% improvement
TXT with 1/3 escapes: 15054.111000 ms  16.743087% improvement
CSV with 1/3 quotes:  25668.009750 ms  21.135768% improvement
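
The percentage columns above are consistent with (baseline - patched) /
baseline * 100, where a negative value means the patched build was slower.
Here is a minimal Python sketch for re-deriving them, assuming that formula;
the helper name `percent_improvement` is hypothetical and is not taken from
the benchmark scripts. The timings are copied from the arm WIDE results.

```python
def percent_improvement(baseline_ms: float, patched_ms: float) -> float:
    """Time saved relative to the baseline, in percent (negative = slower)."""
    return (baseline_ms - patched_ms) / baseline_ms * 100.0


# (baseline, patched) timings in ms, copied from the arm WIDE results above.
arm_wide = {
    "TXT": (5608.834500, 3190.268750),
    "CSV": (8115.155000, 3135.177000),
    "TXT with 1/3 escapes": (7037.290500, 6373.746750),
    "CSV with 1/3 quotes": (10894.615750, 10336.763500),
}

for name, (baseline, patched) in arm_wide.items():
    print(f"{name:<22}{percent_improvement(baseline, patched):10.6f}%")
```

The same function can be pointed at any of the other result pairs to check
whether a given row is an improvement or a regression.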
-- 
-- Manni Wood EDB: https://www.enterprisedb.com

