From: KAZAR Ayoub <[email protected]>
To: Nathan Bossart <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Pg Hackers <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Manni Wood <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: Mark Wong <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD
Date: Thu, 2 Apr 2026 20:07:38 +0200
Message-ID: <CA+K2Ru=JK5NUEaxA77pCEer40QnV1TMxeg68Et9RL0zMZw_Jyw@mail.gmail.com>
In-Reply-To: <acv2vu8miagnHG1B@nathan>
References: <CA+K2Runi_H2CBL0yMm3De2KqcR9RMA0HK5cLJjEhoNszC7myeg@mail.gmail.com>
<[email protected]>
<CA+K2Rum_QTZqTUrdMOL5hr-OOpCwGR_9Nj1z15BFObjktMOY6A@mail.gmail.com>
<abBuKalOno33MQFw@nathan>
<CA+K2Rum7+Jm2rm65K5msxaiAM8QTkhSNAYarPBP9O7nBXYo12Q@mail.gmail.com>
<abmiNPQOqBrRlf_m@nathan>
<CA+K2Rum-TB_iNzDWoXOJspf=jq0gd-wees8+9tBTJNyhy9cK5g@mail.gmail.com>
<CA+K2Ru=PdZuXQbcfvqKysTkebTyXNd9j7dp+mTFQEYpLdGw1eA@mail.gmail.com>
<acWj5FntidHJ9nVP@nathan>
<CA+K2Runq+1gy8p6a-DsxpT2OkEkEu3cUGsZ9tdiGNrg_=P39gg@mail.gmail.com>
<acv2vu8miagnHG1B@nathan>
On Tue, Mar 31, 2026 at 6:30 PM Nathan Bossart <[email protected]>
wrote:
> On Fri, Mar 27, 2026 at 07:48:38PM +0100, KAZAR Ayoub wrote:
> > I added a prescan loop inside the SIMD helpers that tries to catch special
> > characters within the first sizeof(Vector8) characters. I measured how well
> > this reduces the overhead of starting SIMD and exiting at the first vector:
> > the scalar loop is better than SIMD for one vector if it finds a special
> > character before the 6th character; the worst case is not a clean vector,
> > where the scalar loop needs 20 more cycles compared to SIMD.
> > This helps mitigate the case of JSON(B) in CSV format, which is why I
> > added this for the CSV case only.
>
> Interesting.
>
> > In a benchmark with 10M early SIMD exits, like the JSONB case, the previous
> > 3% regression is gone.
>
> While these are nice results, I think it's best that we target v20 for this
> patch so that we have more time to benchmark and explore edge cases.
>
Thanks for the review.
Fair enough, I'll try many more cases in the upcoming weeks to make sure
we're not missing anything.
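
For anyone following the thread, the prescan idea quoted above can be
sketched roughly as follows. This is only an illustration, not the patch's
code: the names is_special, chunk_has_special and find_first_special are
hypothetical, CHUNK stands in for sizeof(Vector8), the set of special
characters (delimiter, quote, newline, carriage return) is an assumption,
and raw SSE2 intrinsics with a scalar fallback are used instead of the
helpers in simd.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define CHUNK 16				/* assumed stand-in for sizeof(Vector8) */

/* Hypothetical helper: is c a character that needs special handling? */
static bool
is_special(char c, char delim, char quote)
{
	return c == delim || c == quote || c == '\n' || c == '\r';
}

/* True if any of the CHUNK bytes starting at p is special. */
static bool
chunk_has_special(const char *p, char delim, char quote)
{
#if defined(__SSE2__)
	__m128i		v = _mm_loadu_si128((const __m128i *) p);
	__m128i		m = _mm_cmpeq_epi8(v, _mm_set1_epi8(delim));

	m = _mm_or_si128(m, _mm_cmpeq_epi8(v, _mm_set1_epi8(quote)));
	m = _mm_or_si128(m, _mm_cmpeq_epi8(v, _mm_set1_epi8('\n')));
	m = _mm_or_si128(m, _mm_cmpeq_epi8(v, _mm_set1_epi8('\r')));
	return _mm_movemask_epi8(m) != 0;
#else
	/* Portable fallback when SSE2 is not available. */
	for (size_t i = 0; i < CHUNK; i++)
		if (is_special(p[i], delim, quote))
			return true;
	return false;
#endif
}

/* Returns the index of the first special character in s, or len if none. */
size_t
find_first_special(const char *s, size_t len, char delim, char quote)
{
	size_t		i = 0;
	size_t		prescan_end = (len < CHUNK) ? len : CHUNK;

	assert(s != NULL || len == 0);

	/* Scalar prescan: exit cheaply when a special character comes early. */
	for (; i < prescan_end; i++)
		if (is_special(s[i], delim, quote))
			return i;

	/* Vector loop: skip whole chunks that contain no special character. */
	for (; i + CHUNK <= len; i += CHUNK)
		if (chunk_has_special(s + i, delim, quote))
			break;

	/* Scalar tail: pin down the position in the flagged chunk or remainder. */
	for (; i < len; i++)
		if (is_special(s[i], delim, quote))
			return i;

	return len;
}
```

With inputs such as JSON(B) values, where a quote usually appears within
the first few bytes, the function returns from the prescan without ever
setting up a vector comparison, which is the early-exit case the benchmark
above targets.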
>
> --
> nathan
Regards,
Ayoub