From: Nathan Bossart <[email protected]>
To: KAZAR Ayoub <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Pg Hackers <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Manni Wood <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: Mark Wong <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD
Date: Tue, 31 Mar 2026 11:30:54 -0500
Message-ID: <acv2vu8miagnHG1B@nathan> (raw)
In-Reply-To: <CA+K2Runq+1gy8p6a-DsxpT2OkEkEu3cUGsZ9tdiGNrg_=P39gg@mail.gmail.com>
References: <CA+K2Runi_H2CBL0yMm3De2KqcR9RMA0HK5cLJjEhoNszC7myeg@mail.gmail.com>
	<[email protected]>
	<CA+K2Rum_QTZqTUrdMOL5hr-OOpCwGR_9Nj1z15BFObjktMOY6A@mail.gmail.com>
	<abBuKalOno33MQFw@nathan>
	<CA+K2Rum7+Jm2rm65K5msxaiAM8QTkhSNAYarPBP9O7nBXYo12Q@mail.gmail.com>
	<abmiNPQOqBrRlf_m@nathan>
	<CA+K2Rum-TB_iNzDWoXOJspf=jq0gd-wees8+9tBTJNyhy9cK5g@mail.gmail.com>
	<CA+K2Ru=PdZuXQbcfvqKysTkebTyXNd9j7dp+mTFQEYpLdGw1eA@mail.gmail.com>
	<acWj5FntidHJ9nVP@nathan>
	<CA+K2Runq+1gy8p6a-DsxpT2OkEkEu3cUGsZ9tdiGNrg_=P39gg@mail.gmail.com>

On Fri, Mar 27, 2026 at 07:48:38PM +0100, KAZAR Ayoub wrote:
> I added a prescan loop inside the SIMD helpers that tries to catch special
> characters in sizeof(Vector8) characters. I measured how well this
> reduces the overhead of starting SIMD and exiting at the first vector:
> the scalar loop is better than SIMD for one vector if it finds a special
> character before the 6th character; worst case is not a clean vector, where
> the scalar loop needs 20 more cycles compared to SIMD.
> This helps mitigate the case of JSON(B) in CSV format, which is why I
> added this for the CSV case only.

Interesting.
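
For anyone following along, the prescan idea quoted above might look roughly
like the sketch below. This is not the patch's code. The names
find_first_special, is_special and chunk_has_special, the CHUNK_SIZE and
PRESCAN_LEN constants, and the set of "special" characters are all hypothetical
and chosen for illustration, and the chunk check is a plain scalar stand-in for
a real vector comparison so that the example builds anywhere.

```c
#include <stdbool.h>
#include <stddef.h>

#define CHUNK_SIZE  16          /* assumed stand-in for sizeof(Vector8) */
#define PRESCAN_LEN CHUNK_SIZE  /* assumed prescan length, not taken from the patch */

/* Illustrative set of characters that would need quoting or escaping. */
static bool
is_special(char c, char delim, char quote)
{
	return c == delim || c == quote || c == '\n' || c == '\r';
}

/* Scalar stand-in for "does this vector contain any special character?" */
static bool
chunk_has_special(const char *p, char delim, char quote)
{
	bool		found = false;

	for (size_t i = 0; i < CHUNK_SIZE; i++)
		found |= is_special(p[i], delim, quote);
	return found;
}

/* Returns the index of the first special character, or len if there is none. */
size_t
find_first_special(const char *s, size_t len, char delim, char quote)
{
	size_t		i = 0;
	size_t		prescan_end = (len < PRESCAN_LEN) ? len : PRESCAN_LEN;

	/* Scalar prescan: exit cheaply when a special character shows up early. */
	for (; i < prescan_end; i++)
	{
		if (is_special(s[i], delim, quote))
			return i;
	}

	/* Chunked path: skip whole chunks that contain nothing special. */
	while (len - i >= CHUNK_SIZE && !chunk_has_special(s + i, delim, quote))
		i += CHUNK_SIZE;

	/* Scalar tail: pinpoint the position, or finish the leftover bytes. */
	for (; i < len; i++)
	{
		if (is_special(s[i], delim, quote))
			return i;
	}
	return len;
}
```

With input like JSON(B) in CSV format, where a quote tends to appear within the
first few bytes, a loop shaped like this returns from the prescan without
paying the setup cost of the chunked path.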

> In a benchmark with 10M early SIMD exit like the JSONB case, the previous
> 3% regression is gone.

While these are nice results, I think it's best that we target v20 for this
patch so that we have more time to benchmark and explore edge cases.

-- 
nathan




