From: Manni Wood <[email protected]>
To: Nazir Bilal Yavuz <[email protected]>
Cc: KAZAR Ayoub <[email protected]>
Cc: Nathan Bossart <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: Sun, 8 Mar 2026 14:45:44 -0500
Message-ID: <CAKWEB6pJ-5b7QUmVtG12hC0bQ82OvDv4XsidAcnngN36q28qTQ@mail.gmail.com> (raw)
In-Reply-To: <CAN55FZ1k2sOZn1qkDDVTX8Z1t99ZUkywBhdp_43euDHaAi5bRA@mail.gmail.com>
References: <aZ3kYQnF9_u6sUQp@nathan>
<CAN55FZ3+NYF1TkKyNtpRQuLiaauSYk9G5tA+fpruOA4-14Y_ZA@mail.gmail.com>
<aaXrGSyq4u2d9qEC@nathan>
<CAN55FZ2DNaKCK3Kf_kHizb2pAbQvULeDYtzaiz97B_xz7YbrkQ@mail.gmail.com>
<[email protected]>
<CAKWEB6r0CrN-a2P=2ey3EK7p1MxsbQx2C8=hpNGfxLxnRaX66Q@mail.gmail.com>
<CAN55FZ1BYbdt97RLkd-vAVZ5EKiX=KLkhJ_S-4vgvNJRHvQZ6w@mail.gmail.com>
<CAKWEB6pmXvv3sbQBNLAR_B=8wzX8rn4VFsJ4WwmNbuJ2etSwDQ@mail.gmail.com>
<CAN55FZ25HV0FNziFR8xBpQpHkNPS48w8yEXCc91W+=0c0jr+KA@mail.gmail.com>
<CAKWEB6rnMKYGSt=t9=pL2kKUefsAdzKfjtwdqW_acv1+vMTKVA@mail.gmail.com>
<aatfiTMLvLok2cUc@nathan>
<CA+K2RunT=1P7_E4jrZMogDiPunNCOGw9P_UHLaJRJk=5_odKmA@mail.gmail.com>
<CAN55FZ1k2sOZn1qkDDVTX8Z1t99ZUkywBhdp_43euDHaAi5bRA@mail.gmail.com>
On Sun, Mar 8, 2026 at 5:31 AM Nazir Bilal Yavuz <[email protected]> wrote:
> Hi,
>
> On Sat, 7 Mar 2026 at 02:31, KAZAR Ayoub <[email protected]> wrote:
> >
> > On Sat, Mar 7, 2026 at 12:13 AM Nathan Bossart <[email protected]> wrote:
> >>
> >> On Fri, Mar 06, 2026 at 03:25:46PM -0600, Manni Wood wrote:
> >> > Well, golly! Look at these numbers. Old master with no lz4, your v11 patch
> >> > with no lz4, and then your v11 patch with lz4 compiled in.
> >>
> >> I'm appreciative of all the benchmarking that you and others are doing, but
> >> wouldn't we be more interested in the difference between "old master with
> >> lz4" and "v11 with lz4"? Else, we have multiple variables in play.
> >
> > Yes I agree because the lz4 effect doesn't prove anything for the SIMD
> > patch itself right ? So basically a comparison for the SIMD effect should
> > be "master with/out lz4 vs patched with/out lz4, respectively and nothing
> > more!", is this correct ?
>
> Yes, I think 'master with/out lz4 vs patched with/out lz4,
> respectively' is enough to determine the effect of the SIMD patch.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft
>
Hello!
As requested, here are some numbers based on the latest master but with the
copy code inlining excised (`git revert
dc592a41557b072178f1798700bf9c69cd8e4235`), compared to master with copy
code inlining left in place and the v11 patch applied.
Both results have lz4 compression in place.
I have not run numbers without lz4. I assume I could use the two postgres
instances that I have compiled with lz4, but just set
`default_toast_compression = pglz` in postgresql.conf for both instances.
Let me know if that is a mistaken assumption on my part.

arm NARROW master without inline with lz4
TXT                 : 10362.799500 ms
CSV                 : 10288.791000 ms
TXT with 1/3 escapes: 10411.416250 ms
CSV with 1/3 quotes : 12318.385750 ms

arm NARROW master with inline with lz4 with v11patch
TXT                 : 10317.125750 ms   0.440747% improvement
CSV                 : 10418.020250 ms  -1.256020% regression
TXT with 1/3 escapes: 10188.319500 ms   2.142809% improvement
CSV with 1/3 quotes : 12032.964500 ms   2.317035% improvement

arm WIDE master without inline with lz4
TXT                 :  5608.834500 ms
CSV                 :  8115.155000 ms
TXT with 1/3 escapes:  7037.290500 ms
CSV with 1/3 quotes : 10894.615750 ms

arm WIDE master with inline with lz4 with v11patch
TXT                 :  3190.268750 ms  43.120647% improvement
CSV                 :  3135.177000 ms  61.366394% improvement
TXT with 1/3 escapes:  6373.746750 ms   9.428966% improvement
CSV with 1/3 quotes : 10336.763500 ms   5.120440% improvement

x86 NARROW-master-without-inline-with-lz4.log
TXT                 : 26701.079250 ms
CSV                 : 26492.235500 ms
TXT with 1/3 escapes: 28590.508250 ms
CSV with 1/3 quotes : 34876.742750 ms

x86 NARROW-master-with-inline-with-lz4-with-v11patch.log
TXT                 : 26511.747750 ms   0.709078% improvement
CSV                 : 26261.269750 ms   0.871824% improvement
TXT with 1/3 escapes: 27702.964750 ms   3.104329% improvement
CSV with 1/3 quotes : 32339.393000 ms   7.275191% improvement

x86 WIDE-master-without-inline-with-lz4.log
TXT                 : 14485.563250 ms
CSV                 : 21392.582000 ms
TXT with 1/3 escapes: 18081.514750 ms
CSV with 1/3 quotes : 32547.086250 ms

x86 WIDE-master-with-inline-with-lz4-with-v11patch.log
TXT                 :  8080.378250 ms  44.217714% improvement
CSV                 :  8283.723000 ms  61.277591% improvement
TXT with 1/3 escapes: 15054.111000 ms  16.743087% improvement
CSV with 1/3 quotes : 25668.009750 ms  21.135768% improvement
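
For anyone re-checking the figures: the percentages above are consistent with
(baseline - patched) / baseline * 100, where a negative value means the patched
build was slower. Here is a minimal Python sketch under that assumption. The
function name `pct_improvement` is hypothetical and this is not the script that
produced the numbers; the two timings are copied from the arm WIDE CSV rows.

```python
def pct_improvement(baseline_ms: float, patched_ms: float) -> float:
    """Return the percent reduction in runtime relative to the baseline.

    Positive means the patched build was faster; negative means slower.
    Assumed formula, inferred from the reported numbers.
    """
    return (baseline_ms - patched_ms) / baseline_ms * 100.0


# Timings copied from the "arm WIDE" CSV rows of the results.
baseline_ms = 8115.155000   # master without inline, with lz4
patched_ms = 3135.177000    # master with inline, with lz4, with v11 patch

print(f"{pct_improvement(baseline_ms, patched_ms):.6f}% improvement")
```

The printed value should agree with the reported 61.366394% up to rounding.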
--
-- Manni Wood EDB: https://www.enterprisedb.com