From: Manni Wood <[email protected]>
To: Nazir Bilal Yavuz <[email protected]>
Cc: KAZAR Ayoub <[email protected]>
Cc: Nathan Bossart <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: Mon, 9 Mar 2026 08:31:39 -0500
Message-ID: <CAKWEB6pWU+2mUa41t0Tb+3XmKyDuGSS=8XaDirumjSc9=d8WJQ@mail.gmail.com>
In-Reply-To: <CAN55FZ1pM3PgKfAuTT5YmG4enCoh93bOjxNi95nqKgwbHmh3hg@mail.gmail.com>
References: <aZ3kYQnF9_u6sUQp@nathan>
<CAN55FZ3+NYF1TkKyNtpRQuLiaauSYk9G5tA+fpruOA4-14Y_ZA@mail.gmail.com>
<aaXrGSyq4u2d9qEC@nathan>
<CAN55FZ2DNaKCK3Kf_kHizb2pAbQvULeDYtzaiz97B_xz7YbrkQ@mail.gmail.com>
<[email protected]>
<CAKWEB6r0CrN-a2P=2ey3EK7p1MxsbQx2C8=hpNGfxLxnRaX66Q@mail.gmail.com>
<CAN55FZ1BYbdt97RLkd-vAVZ5EKiX=KLkhJ_S-4vgvNJRHvQZ6w@mail.gmail.com>
<CAKWEB6pmXvv3sbQBNLAR_B=8wzX8rn4VFsJ4WwmNbuJ2etSwDQ@mail.gmail.com>
<CAN55FZ25HV0FNziFR8xBpQpHkNPS48w8yEXCc91W+=0c0jr+KA@mail.gmail.com>
<CAKWEB6rnMKYGSt=t9=pL2kKUefsAdzKfjtwdqW_acv1+vMTKVA@mail.gmail.com>
<aatfiTMLvLok2cUc@nathan>
<CA+K2RunT=1P7_E4jrZMogDiPunNCOGw9P_UHLaJRJk=5_odKmA@mail.gmail.com>
<CAN55FZ1k2sOZn1qkDDVTX8Z1t99ZUkywBhdp_43euDHaAi5bRA@mail.gmail.com>
<CAKWEB6pJ-5b7QUmVtG12hC0bQ82OvDv4XsidAcnngN36q28qTQ@mail.gmail.com>
<CAN55FZ1pM3PgKfAuTT5YmG4enCoh93bOjxNi95nqKgwbHmh3hg@mail.gmail.com>
On Mon, Mar 9, 2026 at 3:10 AM Nazir Bilal Yavuz <[email protected]> wrote:
> Hi,
>
> On Sun, 8 Mar 2026 at 22:45, Manni Wood <[email protected]>
> wrote:
> >
> > As requested, here are some numbers based on the latest master but with
> > the copy code inlining excised (`git revert
> > dc592a41557b072178f1798700bf9c69cd8e4235`), compared to master with copy
> > code inlining left in place and the v11 patch applied.
> > Both results have lz4 compression in place.
>
> Thank you for the benchmark!
>
> > I have not run numbers without lz4. I assume I could use the two
> > postgres instances that I have compiled with lz4, but just set
> > `default_toast_compression = pglz` in postgresql.conf for both instances.
> > Let me know if that is a mistaken assumption on my part.
>
> I am a bit confused. Are you asking that for the current benchmark you
> shared or future benchmarks? I assume your current benchmark has
> 'default_toast_compression = lz4' because your benchmark results are
> very similar to my benchmark with 'default_toast_compression = lz4'
> but I just wanted to make sure.
>
> What you said about editing postgresql.conf is correct but you need to
> make this change before creating the Postgres instance with 'pg_ctl
> ... start' command, otherwise it won't have an effect and you need to
> restart the instance to see the effect. Also, if you want to benchmark
> without the lz4 change, you can just use the "SET
> default_toast_compression to 'pglz';" command in psql; then you don't
> need to edit postgresql.conf. Please note that this will affect only
> the psql session in which you typed the command. To make things easier, you
> can run the 'SHOW default_toast_compression;' command to see the
> current value of 'default_toast_compression'.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft
>
Hello, Nazir!

I was being too brief.

The benchmarks I shared were absolutely with lz4 compiled in
and 'default_toast_compression = lz4' set in postgresql.conf for every
postgres instance I tested with. (Furthermore, I ran `show
default_toast_compression` via `psql` on each postgres instance to be
sure 'default_toast_compression = lz4' was really set!)

Also, all were compiled with meson using the `debugoptimized` build type,
which results in `-g -O2`.

So those are the benchmarks that I shared.
OK, so my final question, hopefully clarified: If I run additional
benchmarks where pglz is used for default_toast_compression, is it enough
to use the instances I have already compiled with lz4 in them, but
with 'default_toast_compression = pglz' explicitly set in postgresql.conf
in a brand new data dir created by initdb? (In other words: existing data
dir deleted, then initdb run to make a new data dir, then postgresql.conf
edited to ensure 'default_toast_compression = pglz' is explicitly set, then
and only then starting up the cluster for the first time, and finally
verifying via `show default_toast_compression` for good measure.)

Or should I re-compile with the lz4-is-now-the-default commit completely
excised?
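
For concreteness, the fresh-data-dir procedure described above could be sketched roughly as the Python script below. This is only an illustration under stated assumptions: the helper names (`conf_line`, `fresh_cluster`), the log file location, and the default port are hypothetical and not taken from the thread, and `initdb`, `pg_ctl`, and `psql` are assumed to be on `PATH` for the build under test.

```python
import shutil
import subprocess
from pathlib import Path


def conf_line(method: str) -> str:
    """Build the postgresql.conf line that pins the TOAST compression method."""
    if method not in ("pglz", "lz4"):
        raise ValueError(f"unsupported compression method: {method}")
    return f"default_toast_compression = '{method}'\n"


def fresh_cluster(datadir: Path, method: str, port: int = 5432) -> str:
    """Recreate datadir, set the method before first start, return what SHOW reports.

    Hypothetical helper: the paths, port, and log file name are assumptions.
    """
    # Delete any existing data dir, then create a brand new one.
    shutil.rmtree(datadir, ignore_errors=True)
    subprocess.run(["initdb", "-D", str(datadir)], check=True)

    # Append the setting; if a parameter appears more than once in
    # postgresql.conf, the last occurrence is the one that takes effect.
    with open(datadir / "postgresql.conf", "a") as conf:
        conf.write(conf_line(method))

    # Only now start the cluster for the first time (-w waits for startup).
    subprocess.run(
        ["pg_ctl", "-D", str(datadir), "-l", str(datadir / "logfile"),
         "-w", "-o", f"-p {port}", "start"],
        check=True,
    )

    # Verify the effective value for good measure.
    shown = subprocess.run(
        ["psql", "-X", "-At", "-p", str(port), "-d", "postgres",
         "-c", "SHOW default_toast_compression"],
        check=True, capture_output=True, text=True,
    )
    return shown.stdout.strip()
```

Calling `fresh_cluster(Path("/tmp/bench_pglz"), "pglz", port=5433)` (an example path and port) would be expected to return the value reported by `SHOW`, which can then be compared against the intended method before any benchmark is run.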
Thanks so much!
-Manni
--
-- Manni Wood EDB: https://www.enterprisedb.com