From: Manni Wood <[email protected]>
To: Nazir Bilal Yavuz <[email protected]>
Cc: Nathan Bossart <[email protected]>
Cc: KAZAR Ayoub <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: Mon, 23 Feb 2026 22:44:44 -0600
Message-ID: <CAKWEB6qzsZEQ4Czo9QBFiMXqdXVJknHUJwg6wjRwNzLn4+Jw0g@mail.gmail.com> (raw)
In-Reply-To: <CAN55FZ3cBN_TncLVWyXAKm-KfewguN1AUjyRhoR6zL_QCxHh7A@mail.gmail.com>
References: <CAN55FZ3g6QaiC8G4GMjdJ24egvgc-HG_xpoOztxnM_wnQNn5aw@mail.gmail.com>
	<aY-vJe_ENCB-fux9@nathan>
	<CAN55FZ2OpqRxUUEvgPpHCk2HnY0xZSH1x09fgFGOUyXSv8HcEA@mail.gmail.com>
	<aZYudtuBLVb36pZE@nathan>
	<CAN55FZ0J5iz9wFJLHcK7yNQqPb10_4ROoZiDu1wBZWSGC_fATg@mail.gmail.com>
	<CAKWEB6qY=mU62oAQFAVPCFWvwRuTPKBwxvM2aZ+J7p_9_MBmhQ@mail.gmail.com>
	<CAN55FZ2RPMxquXE6TH7dQkhtoiBcOOOZq8EOXj5COHv3ecP_cw@mail.gmail.com>
	<CA+K2Ru=fFTUVgEDr-fBed5aOMeDbH9vrOEhapXzHEpBeOxkucg@mail.gmail.com>
	<CAKWEB6pq7C0Wv1wT9Y1_c_1fn-+cR8pb210Pj3w2FcEOmNGxbQ@mail.gmail.com>
	<CAN55FZ2DT4-k06umn=7NYG+NoM6gnVJVQCCwRrr2qOraO+Jadw@mail.gmail.com>
	<aZikzQP6WPJ5Rq2S@nathan>
	<CAN55FZ3cBN_TncLVWyXAKm-KfewguN1AUjyRhoR6zL_QCxHh7A@mail.gmail.com>

On Mon, Feb 23, 2026 at 3:10 AM Nazir Bilal Yavuz <[email protected]>
wrote:

> Hi,
>
> On Fri, 20 Feb 2026 at 21:15, Nathan Bossart <[email protected]>
> wrote:
> >
> > Yeah, the couple of small regressions seem close to (or below) the noise
> > level, and IIUC yours were the only benchmarks that showed them, anyway.
> > Plus, I think we'll need this change regardless as a prerequisite for the
> > SIMD work.
> >
> > > Thank you both for the benchmarks. Results look good to me!
> >
> > Committed that part.
>
> Thank you! Attaching the SIMD patch only.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft
>

Hello!

I ran some speed tests on Nazir's v10 SIMD-only patch. I'm a bit surprised
by the regression on x86 with wide rows in the 1/3 special-character
scenarios (escapes and quotes). I'm hoping it's something I did wrong. If
anyone else has numbers to share, that would be excellent.

x86 NARROW master 50,000,000 rows
TXT :                 26359.319000 ms
CSV :                 25661.199750 ms
TXT with 1/3 escapes: 28170.085250 ms
CSV with 1/3 quotes:  32638.147500 ms

x86 NARROW v10 50,000,000 rows
TXT :                 26416.331500 ms  -0.216290% regression
CSV :                 25318.727500 ms  1.334592% improvement
TXT with 1/3 escapes: 28608.007500 ms  -1.554565% regression
CSV with 1/3 quotes:  32805.627750 ms  -0.513143% regression

x86 WIDE master 500,000 rows
TXT :                 26475.164250 ms
CSV :                 31963.478500 ms
TXT with 1/3 escapes: 29671.120750 ms
CSV with 1/3 quotes:  40391.616250 ms

x86 WIDE v10 500,000 rows
TXT :                 23067.046750 ms  12.872885% improvement
CSV :                 23259.092250 ms  27.232287% improvement
TXT with 1/3 escapes: 31796.098250 ms  -7.161770% regression
CSV with 1/3 quotes:  42925.792250 ms  -6.274015% regression



arm NARROW master 25,000,000 rows
TXT :                 10077.096250 ms
CSV :                 10310.671250 ms
TXT with 1/3 escapes:  9893.155000 ms
CSV with 1/3 quotes:  12133.064750 ms

arm NARROW v10 25,000,000 rows
TXT :                 10467.816750 ms  -3.877312% regression
CSV :                  9986.288000 ms  3.146092% improvement
TXT with 1/3 escapes: 10323.173750 ms  -4.346629% regression
CSV with 1/3 quotes:  11843.611750 ms  2.385654% improvement

arm WIDE master 250,000 rows
TXT :                 10568.344750 ms
CSV :                 13046.610500 ms
TXT with 1/3 escapes: 12193.088500 ms
CSV with 1/3 quotes:  16629.319000 ms

arm WIDE v10 250,000 rows
TXT :                  9064.959000 ms  14.225366% improvement
CSV :                  9019.553250 ms  30.866693% improvement
TXT with 1/3 escapes: 12344.497250 ms  -1.241759% regression
CSV with 1/3 quotes:  15495.863750 ms  6.816005% improvement
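
The percentages above are consistent with (master - v10) / master * 100,
so a positive value means v10 was faster. A minimal Python sketch of that
arithmetic, for anyone who wants to check or extend the numbers; the name
`pct_change` is a hypothetical helper, not part of any benchmark script
from this thread:

```python
def pct_change(master_ms: float, patched_ms: float) -> float:
    """Relative change versus master, in percent.

    Positive means the patched build took less time than master
    (improvement); negative means it took more (regression).
    """
    return (master_ms - patched_ms) / master_ms * 100.0


# Timings copied from the x86 WIDE results above (master, v10).
x86_wide_txt = pct_change(26475.164250, 23067.046750)
x86_wide_txt_esc = pct_change(29671.120750, 31796.098250)

for label, value in [
    ("x86 WIDE TXT", x86_wide_txt),
    ("x86 WIDE TXT with 1/3 escapes", x86_wide_txt_esc),
]:
    kind = "improvement" if value >= 0 else "regression"
    print(f"{label}: {value:.6f}% {kind}")
```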

-- 
-- Manni Wood EDB: https://www.enterprisedb.com