From: Manni Wood
Date: Sun, 8 Mar 2026 14:45:44 -0500
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: Nazir Bilal Yavuz
Cc: KAZAR Ayoub, Nathan Bossart, Andrew Dunstan, Neil Conway, Shinya Kato, PostgreSQL-development


On Sun, Mar 8, 2026 at 5:31 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

> Hi,
>
> On Sat, 7 Mar 2026 at 02:31, KAZAR Ayoub <ma_kazar@esi.dz> wrote:
> >
> > On Sat, Mar 7, 2026 at 12:13 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >>
> >> On Fri, Mar 06, 2026 at 03:25:46PM -0600, Manni Wood wrote:
> >> > Well, golly! Look at these numbers. Old master with no lz4, your v11 patch
> >> > with no lz4, and then your v11 patch with lz4 compiled in.
> >>
> >> I'm appreciative of all the benchmarking that you and others are doing, but
> >> wouldn't we be more interested in the difference between "old master with
> >> lz4" and "v11 with lz4"?  Else, we have multiple variables in play.
> >
> > Yes I agree because the lz4 effect doesn't prove anything for the SIMD patch itself right ? So basically a comparison for the SIMD effect should be "master with/out lz4 vs patched with/out lz4, respectively and nothing more!", is this correct ?
>
> Yes, I think 'master with/out lz4 vs patched with/out lz4,
> respectively' is enough to determine the effect of the SIMD patch.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft

Hello!

As requested, here are some numbers based on the latest master but with the copy code inlining excised (`git revert dc592a41557b072178f1798700bf9c69cd8e4235`), compared to master with copy code inlining left in place and the v11 patch applied.
Both results have lz4 compression in place.

I have not run numbers without lz4. I assume I could use the two postgres instances that I have compiled with lz4, but just set `default_toast_compression = pglz` in postgresql.conf for both instances. Let me know if that is a mistaken assumption on my part.

arm NARROW master without inline with lz4
TXT :                 10362.799500 ms
CSV :                 10288.791000 ms
TXT with 1/3 escapes: 10411.416250 ms
CSV with 1/3 quotes:  12318.385750 ms

arm NARROW master with inline with lz4 with v11patch
TXT :                 10317.125750 ms   0.440747% improvement
CSV :                 10418.020250 ms  -1.256020% regression
TXT with 1/3 escapes: 10188.319500 ms   2.142809% improvement
CSV with 1/3 quotes:  12032.964500 ms   2.317035% improvement


arm WIDE master without inline with lz4
TXT :                  5608.834500 ms
CSV :                  8115.155000 ms
TXT with 1/3 escapes:  7037.290500 ms
CSV with 1/3 quotes:  10894.615750 ms

arm WIDE master with inline with lz4 with v11patch
TXT :                  3190.268750 ms  43.120647% improvement
CSV :                  3135.177000 ms  61.366394% improvement
TXT with 1/3 escapes:  6373.746750 ms   9.428966% improvement
CSV with 1/3 quotes:  10336.763500 ms   5.120440% improvement



x86 NARROW-master-without-inline-with-lz4.log
TXT :                 26701.079250 ms
CSV :                 26492.235500 ms
TXT with 1/3 escapes: 28590.508250 ms
CSV with 1/3 quotes:  34876.742750 ms

x86 NARROW-master-with-inline-with-lz4-with-v11patch.log
TXT :                 26511.747750 ms   0.709078% improvement
CSV :                 26261.269750 ms   0.871824% improvement
TXT with 1/3 escapes: 27702.964750 ms   3.104329% improvement
CSV with 1/3 quotes:  32339.393000 ms   7.275191% improvement


x86 WIDE-master-without-inline-with-lz4.log
TXT :                 14485.563250 ms
CSV :                 21392.582000 ms
TXT with 1/3 escapes: 18081.514750 ms
CSV with 1/3 quotes:  32547.086250 ms

x86 WIDE-master-with-inline-with-lz4-with-v11patch.log
TXT :                  8080.378250 ms  44.217714% improvement
CSV :                  8283.723000 ms  61.277591% improvement
TXT with 1/3 escapes: 15054.111000 ms  16.743087% improvement
CSV with 1/3 quotes:  25668.009750 ms  21.135768% improvement
--
Manni Wood
EDB: https://www.enterprisedb.com