From: Manni Wood
Date: Mon, 23 Feb 2026 22:44:44 -0600
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: Nazir Bilal Yavuz
Cc: Nathan Bossart, KAZAR Ayoub, Neil Conway, Andrew Dunstan, Shinya Kato, PostgreSQL-development


On Mon, Feb 23, 2026 at 3:10 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

> Hi,
>
> On Fri, 20 Feb 2026 at 21:15, Nathan Bossart <nathandbossart@gmail.com> wrote:
> >
> > Yeah, the couple of small regressions seem close to (or below) the noise
> > level, and IIUC yours were the only benchmarks that showed them, anyway.
> > Plus, I think we'll need this change regardless as a prerequisite for the
> > SIMD work.
> >
> > > Thank you both for the benchmarks. Results look good to me!
> >
> > Committed that part.
>
> Thank you! Attaching the SIMD patch only.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft

Hello!

I ran some speed tests on Nazir's v10 SIMD-only patch. I'm a bit surprised at the regression for x86 with wide rows for the 1/3rd special characters scenarios. I'm hoping it's something I did wrong. If anyone else has numbers to share, that would be excellent.

x86 NARROW master 50,000,000 rows
TXT :                 26359.319000 ms
CSV :                 25661.199750 ms
TXT with 1/3 escapes: 28170.085250 ms
CSV with 1/3 quotes:  32638.147500 ms

x86 NARROW v10 50,000,000 rows
TXT :                 26416.331500 ms  -0.216290% regression
CSV :                 25318.727500 ms  1.334592% improvement
TXT with 1/3 escapes: 28608.007500 ms  -1.554565% regression
CSV with 1/3 quotes:  32805.627750 ms  -0.513143% regression

x86 WIDE master 500,000 rows
TXT :                 26475.164250 ms
CSV :                 31963.478500 ms
TXT with 1/3 escapes: 29671.120750 ms
CSV with 1/3 quotes:  40391.616250 ms

x86 WIDE v10 500,000 rows
TXT :                 23067.046750 ms  12.872885% improvement
CSV :                 23259.092250 ms  27.232287% improvement
TXT with 1/3 escapes: 31796.098250 ms  -7.161770% regression
CSV with 1/3 quotes:  42925.792250 ms  -6.274015% regression



arm NARROW master 25,000,000 rows
TXT :                 10077.096250 ms
CSV :                 10310.671250 ms
TXT with 1/3 escapes: 9893.155000 ms
CSV with 1/3 quotes:  12133.064750 ms

arm NARROW v10 25,000,000 rows
TXT :                 10467.816750 ms  -3.877312% regression
CSV :                 9986.288000 ms  3.146092% improvement
TXT with 1/3 escapes: 10323.173750 ms  -4.346629% regression
CSV with 1/3 quotes:  11843.611750 ms  2.385654% improvement

arm WIDE master 250,000 rows
TXT :                 10568.344750 ms
CSV :                 13046.610500 ms
TXT with 1/3 escapes: 12193.088500 ms
CSV with 1/3 quotes:  16629.319000 ms

arm WIDE v10 250,000 rows
TXT :                 9064.959000 ms  14.225366% improvement
CSV :                 9019.553250 ms  30.866693% improvement
TXT with 1/3 escapes: 12344.497250 ms  -1.241759% regression
CSV with 1/3 quotes:  15495.863750 ms  6.816005% improvement
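
For anyone cross-checking the figures: the percentages in the results above are consistent with (master - v10) / master * 100, so a positive value means v10 was faster and a negative one means it was slower. The mail does not state the formula, so treat it as an assumption. A minimal Python sketch under that assumption, using the x86 WIDE timings from this mail (the names pct_change, report and X86_WIDE are illustrative only):

```python
# Assumption (not stated in the mail): percent change is computed as
# (master - patched) / master * 100, so positive means the patch is faster.

def pct_change(master_ms: float, patched_ms: float) -> float:
    """Relative time saved by the patched build versus master, in percent."""
    return (master_ms - patched_ms) / master_ms * 100.0


# x86 WIDE timings in milliseconds, copied from the results: (master, v10).
X86_WIDE = {
    "TXT": (26475.164250, 23067.046750),
    "CSV": (31963.478500, 23259.092250),
    "TXT with 1/3 escapes": (29671.120750, 31796.098250),
    "CSV with 1/3 quotes": (40391.616250, 42925.792250),
}


def report(timings: dict) -> list:
    """Format one line per scenario in the same style as the results."""
    lines = []
    for name, (master_ms, patched_ms) in timings.items():
        pct = pct_change(master_ms, patched_ms)
        label = "improvement" if pct >= 0 else "regression"
        lines.append(f"{name}: {pct:.6f}% {label}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(X86_WIDE)))
```

Running the same function over the other three groups is a quick way to confirm that the sign convention is the same for every row before comparing against numbers from another machine.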

--
-- Manni Wood
EDB: https://www.enterprisedb.com