From: Nathan Bossart <[email protected]>
To: KAZAR Ayoub <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Pg Hackers <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Manni Wood <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: Mark Wong <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD
Date: Thu, 26 Mar 2026 16:09:23 -0500
Message-ID: <acWgg0cG7MALI2hB@nathan> (raw)
In-Reply-To: <CA+K2Rum-TB_iNzDWoXOJspf=jq0gd-wees8+9tBTJNyhy9cK5g@mail.gmail.com>
References: <CA+K2Runi_H2CBL0yMm3De2KqcR9RMA0HK5cLJjEhoNszC7myeg@mail.gmail.com>
	<[email protected]>
	<CA+K2Rum_QTZqTUrdMOL5hr-OOpCwGR_9Nj1z15BFObjktMOY6A@mail.gmail.com>
	<abBuKalOno33MQFw@nathan>
	<CA+K2Rum7+Jm2rm65K5msxaiAM8QTkhSNAYarPBP9O7nBXYo12Q@mail.gmail.com>
	<abmiNPQOqBrRlf_m@nathan>
	<CA+K2Rum-TB_iNzDWoXOJspf=jq0gd-wees8+9tBTJNyhy9cK5g@mail.gmail.com>

On Wed, Mar 18, 2026 at 12:02:28AM +0100, KAZAR Ayoub wrote:
>   Test                 Master    v3       v3_var   v3_var_noinl
>   TEXT clean           1504ms   -24.1%   -23.0%   -21.5%
>   CSV clean            1760ms   -34.9%   -32.7%   -33.0%

Nice!

>   TEXT 1/3 backslashes     3763ms    +4.6%    +6.9%    +4.1%
>   CSV 1/3 quotes           3885ms    +3.1%    +2.7%    -0.8%

Hm.  These seem a little bit beyond what we could ignore as noise.

> Wide table TEXT (integer columns):
> 
>   Cols    Master    v3       v3_var   v3_var_noinl
>   50      2083ms   -0.7%    -0.6%    +3.5%
>   100     4094ms   -0.1%    -0.5%    +4.5%
>   200     1560ms   +0.6%    -2.3%    +3.2%
>   500     1905ms   -1.0%    -1.3%    +4.7%
>   1000    1455ms   +1.8%    +0.4%    +4.3%

These numbers look roughly within the noise range.

> Wide table CSV:
> 
>   Cols    Master    v3       v3_var   v3_var_noinl
>   50      2421ms   +4.0%    +6.7%    +5.8%

Hm.  Is this reproducible?  A 4% regression is a bit worrisome.

>   100     4980ms   +0.1%    +2.0%    +0.1%
>   200     1901ms   +1.4%    +3.5%    +1.4%
>   500     2328ms   +1.8%    +2.7%    +2.2%
>   1000    1815ms   +2.0%    +2.8%    +2.5%

These numbers don't bother me too much, but maybe there are some ways to
minimize the regressions further.

-- 
nathan




