From: KAZAR Ayoub
Date: Tue, 24 Feb 2026 16:07:38 +0100
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: Nazir Bilal Yavuz
Cc: Manni Wood, Nathan Bossart, Neil Conway, Andrew Dunstan, Shinya Kato, PostgreSQL-development

Hello,

On Tue, Feb 24, 2026 at 2:57 PM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> Hi,
>
> On Tue, 24 Feb 2026 at 07:44, Manni Wood <manni.wood@enterprisedb.com> wrote:
> >
> > Hello!
> >
> > I ran some speed tests on Nazir's v10 SIMD-only patch. I'm a bit surprised at the regression for x86 with wide rows for the 1/3rd special characters scenarios. I'm hoping it's something I did wrong. If anyone else has numbers to share, that would be excellent.
>
> Thank you for doing this!
>
> I see similar regression on the wide & CSV 1/3 case by using your
> benchmark script. I didn't see this regression when I used my
> benchmark while sharing v9 [1].
>
> +-------------+---------------------------+---------------------------+
> |             |           Text            |            CSV            |
> +-------------+-------------+-------------+-------------+-------------+
> |  WIDE TEST  |    None     |     1/3     |    None     |     1/3     |
> +-------------+-------------+-------------+-------------+-------------+
> |   Master    |    9996     |    10769    |    11548    |    13960    |
> +-------------+-------------+-------------+-------------+-------------+
> |     v10     | 8912 %-10.8 | 10902 %+1.2 | 8952 %-22.4 | 15123 %+8.3 |
> +-------------+-------------+-------------+-------------+-------------+
> |             |             |             |             |             |
> +-------------+---------------------------+---------------------------+
> |             |           Text            |            CSV            |
> +-------------+-------------+-------------+-------------+-------------+
> | NARROW TEST |    None     |     1/3     |    None     |     1/3     |
> +-------------+-------------+-------------+-------------+-------------+
> |   Master    |    9441     |    9561     |    9734     |    9830     |
> +-------------+-------------+-------------+-------------+-------------+
> |     v10     |  9291 %-1.5 |  9504 -%0.5 |  9644 %-0.9 | 10078 %-2.4 |
> +-------------+-------------+-------------+-------------+-------------+
>
> I will investigate this. However, please note that the current master
> includes the inlining commit (dc592a4155), which makes the COPY FROM
> faster. In my case,
>
> 1: current master without dc592a4155: 14400ms
> 2: current master: 13960ms (%3 improvement against #1)
> 3: current master + SIMD: 15123ms (%5 regression against #1 and %8
> regression against #2)
>
> Is it possible for you to do a similar test? I mean dropping
> dc592a4155 from the current master and re-running the benchmark, that
> would be helpful.
>
> [1] https://postgr.es/m/CAN55FZ0MiFCgK26gRgE05a%3D_ggenkxDM8H%3DA2uTHpywczqt%3D-Q%40mail.gmail.com
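
The percentages in the three-way comparison quoted above depend on which run is taken as the baseline. A minimal Python sketch of that arithmetic, using only the three timings from the quote; the function name `relative_change` is an illustrative assumption and is not part of the patch or of either benchmark script:

```python
def relative_change(baseline_ms: float, new_ms: float) -> float:
    """Return the change of new_ms relative to baseline_ms, in percent.

    Negative means faster than the baseline, positive means slower.
    """
    return (new_ms - baseline_ms) / baseline_ms * 100.0


# Timings quoted above (wide rows, CSV, 1/3 special characters).
without_inlining_ms = 14400  # 1: current master without dc592a4155
master_ms = 13960            # 2: current master
simd_ms = 15123              # 3: current master + SIMD

print(f"master vs #1: {relative_change(without_inlining_ms, master_ms):+.1f}%")
print(f"SIMD vs #1:   {relative_change(without_inlining_ms, simd_ms):+.1f}%")
print(f"SIMD vs #2:   {relative_change(master_ms, simd_ms):+.1f}%")
```

The same SIMD run looks like roughly a 5% or an 8% regression depending on whether the inlining commit is part of the baseline, which is why re-running without dc592a4155 is useful.
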
Here are some numbers for v10 from my end; these are averages over multiple long runs.
Master contains the previous inlining patch.

This is on an Intel i7-1255U CPU.

WIDE (500k rows)

TXT | none
Master avg: 20,721 ms
New avg: 17,980 ms
Improvement: -13.23%

CSV | none
Master avg: 26,608 ms
New avg: 18,433 ms
Improvement: -30.73%

TXT | escape
Master avg: 25,069 ms
New avg: 22,910 ms
Improvement: -8.61%

CSV | quote
Master avg: 31,931 ms
New avg: 31,493 ms
Improvement: -1.37%

--------------------------------------

NARROW (15M rows)

TXT | none
Master avg: 20,687 ms
New avg: 20,824 ms
Regression: +0.67%

CSV | none
Master avg: 21,187 ms
New avg: 21,153 ms
Improvement: -0.16%

TXT | escape
Master avg: 20,870 ms
New avg: 21,341 ms
Regression: +2.25%

CSV | quote
Master avg: 22,074 ms
New avg: 22,267 ms
Regression: +0.87%

For the narrow case, the differences are mostly noise plus the effect of the extra branches.
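
For anyone who wants to cross-check the figures, here is a small Python sketch that recomputes the relative change from the averages listed above. The dictionary layout and the names `RESULTS` and `percent_change` are illustrative assumptions, not part of the benchmark script. Because the averages are rounded to whole milliseconds, a recomputed value can differ from the reported one in the last digit.

```python
# (test, format, special chars) -> (master avg ms, v10 avg ms),
# copied from the results above.
RESULTS = {
    ("wide", "TXT", "none"): (20721, 17980),
    ("wide", "CSV", "none"): (26608, 18433),
    ("wide", "TXT", "escape"): (25069, 22910),
    ("wide", "CSV", "quote"): (31931, 31493),
    ("narrow", "TXT", "none"): (20687, 20824),
    ("narrow", "CSV", "none"): (21187, 21153),
    ("narrow", "TXT", "escape"): (20870, 21341),
    ("narrow", "CSV", "quote"): (22074, 22267),
}


def percent_change(master_ms: float, new_ms: float) -> float:
    """Change of new_ms relative to master_ms in percent (negative = faster)."""
    return (new_ms - master_ms) / master_ms * 100.0


for (test, fmt, special), (master_ms, new_ms) in RESULTS.items():
    change = percent_change(master_ms, new_ms)
    label = "Improvement" if change < 0 else "Regression"
    print(f"{test:6} {fmt} | {special:6} {label}: {change:+.2f}%")
```
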

Regards,
Ayoub
