From: KAZAR Ayoub <[email protected]>
To: Nathan Bossart <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: Manni Wood <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: Wed, 26 Nov 2025 12:50:58 +0100
Message-ID: <CA+K2Rump8NoMRZRZ2r4jHXUJwByasy_c3_b0oaO+TLkSbMD-jw@mail.gmail.com>
In-Reply-To: <aR4wDwNdLc5TmcQq@nathan>
References: <aPkvi5P7kpA8oQKc@nathan>
<[email protected]>
<CAKWEB6qdyhN3EoUNAK23etXX-kXH-_79NNbTsKqtF1g1WkuaBQ@mail.gmail.com>
<CA+K2RumMC+avYGSX-AWNeod3w+XOGHrVPz8HiqkvJj7AZ5tZXA@mail.gmail.com>
<CAKWEB6pev=pNVi4qDYWS50N=YFrKRbjH1h=5F1bXpnK7WR5CYg@mail.gmail.com>
<aRue0D4QQkUf2B_N@nathan>
<CAOzEurTHCGL-Txqf5rxMsPgTF=dTCOsr=uhJdXebqjEJy-0L7g@mail.gmail.com>
<CAN55FZ0+JZvKYVCnJqLhHaWF9eBGmTaF1BCEpttxw1aT3G_+Qw@mail.gmail.com>
<[email protected]>
<CAN55FZ1XF=R7F7B__gq04rp2nQnJqs1yfExEXo4riWc68+Pe0w@mail.gmail.com>
<aR4wDwNdLc5TmcQq@nathan>
Hello,
On Wed, Nov 19, 2025 at 10:01 PM Nathan Bossart <[email protected]>
wrote:
> On Tue, Nov 18, 2025 at 05:20:05PM +0300, Nazir Bilal Yavuz wrote:
> > Thanks, done.
>
> I took a look at the v3 patches. Here are my high-level thoughts:
>
> + /*
> + * Parse data and transfer into line_buf. To get benefit from inlining,
> + * call CopyReadLineText() with the constant boolean variables.
> + */
> + if (cstate->simd_continue)
> + result = CopyReadLineText(cstate, is_csv, true);
> + else
> + result = CopyReadLineText(cstate, is_csv, false);
>
> I'm curious whether this actually generates different code, and if it does,
> if it's actually faster. We're already branching on cstate->simd_continue
> here.
I've compiled both versions with -O2 and confirmed they generate different
code. When simd_continue is passed as a constant to CopyReadLineText, the
compiler optimizes out the condition checks from the SIMD path.
A small benchmark on a 1GB+ file shows the expected benefit: roughly a 6%
performance improvement.
I've attached the assembly outputs in case someone wants to check something
else.
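For anyone who wants to see the pattern in isolation, here is a minimal
standalone sketch. It is not the patch's code: scan_line() and
find_newline() are hypothetical names, and the 8-byte loop is only a scalar
stand-in for a SIMD chunk. The point is that the caller branches once on the
runtime flag and then passes a literal constant, so an inlining compiler can
emit two specialized copies of the loop with the flag test removed from each.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for CopyReadLineText(). use_fast is tested inside
 * the loop, so when the call is inlined with a literal argument the
 * compiler can drop the test (and the untaken path) entirely.
 */
static inline size_t
scan_line(const char *buf, size_t len, bool use_fast)
{
    size_t      i = 0;

    while (i < len)
    {
        if (use_fast && len - i >= 8)
        {
            /* Chunked path: look at 8 bytes at once (scalar stand-in). */
            bool        found = false;

            for (size_t j = 0; j < 8; j++)
            {
                if (buf[i + j] == '\n')
                    found = true;
            }
            if (!found)
            {
                i += 8;         /* no newline in this chunk, skip it */
                continue;
            }
        }

        /* Byte-at-a-time path, also used to pinpoint a hit in a chunk. */
        if (buf[i] == '\n')
            return i;
        i++;
    }
    return len;                 /* no newline found */
}

/* Branch once on the runtime flag, then call with literal constants. */
size_t
find_newline(const char *buf, size_t len, bool simd_continue)
{
    if (simd_continue)
        return scan_line(buf, len, true);
    else
        return scan_line(buf, len, false);
}
```

Both calls return the same result; only the generated code differs. Whether
the compiler really inlines and specializes depends on the optimization
level and its inlining heuristics, so it is worth checking the assembly as
was done above.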
Regards,
Ayoub Kazar
Attachments:
[application/octet-stream] copyfromparse-constant.asm (48.0K, 3-copyfromparse-constant.asm)
[application/octet-stream] copyfromparse-variable.asm (47.1K, 4-copyfromparse-variable.asm)