From: Nazir Bilal Yavuz <[email protected]>
To: Nathan Bossart <[email protected]>
Cc: KAZAR Ayoub <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Manni Wood <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Shinya Kato <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: Fri, 13 Feb 2026 14:45:30 +0300
Message-ID: <CAN55FZ3g6QaiC8G4GMjdJ24egvgc-HG_xpoOztxnM_wnQNn5aw@mail.gmail.com>
In-Reply-To: <aY0FL4rXUl6ykn-a@nathan>
References: <CAKWEB6oZdQhhBV3ojHLBwjQgKzfDw0fkqncurt9oi7vNsq41ww@mail.gmail.com>
<CAN55FZ1p5UyUdTRO7iWR_ukjhJDOnpOR2rYNOq=+hcC45OuahQ@mail.gmail.com>
<CAOW5sYZEx=fPw2wp7y2nK_-ifXFeYW4CTmFx_OQeoHFjG7rbHw@mail.gmail.com>
<CA+K2Ru=C_woAnd-3-pGHoNSTR8FOf=7eeSWE1xaLt9ojVWndVg@mail.gmail.com>
<CAN55FZ0FRB2OD6-oEESLvgUT4bLZQVD72pAqUqzdw7Rx5cN0ig@mail.gmail.com>
<CA+K2Run1VdLnmp-5_Qv2Fax0KgT7LLJMH-uzjaaf-NZD1oU-=w@mail.gmail.com>
<aYZdKSTw6N3khsVE@nathan>
<CAN55FZ2DOeLjSXE2Jos99bgHG-Zeo3KjStrSgoA8Rf=2Mu+hFA@mail.gmail.com>
<aYZvdsXPElQvwWOA@nathan>
<CAN55FZ1=O6TjeZM2CUT7T2tu66uJT+w3G9FiRXVs+gt_ousFxQ@mail.gmail.com>
<aY0FL4rXUl6ykn-a@nathan>
Hi,
Thanks for the review!
On Thu, 12 Feb 2026 at 01:39, Nathan Bossart <[email protected]> wrote:
>
> On Wed, Feb 11, 2026 at 04:27:50PM +0300, Nazir Bilal Yavuz wrote:
> > I am sharing a v6 which implements (1). My benchmark results show
> > almost no difference for the special-character cases and a nice
> > improvement for the no-special-character cases.
>
> Thanks!
>
> > + /* Initialize SIMD variables */
> > + cstate->simd_enabled = false;
> > + cstate->simd_initialized = false;
>
> > + /* Initialize SIMD on the first read */
> > + if (unlikely(!cstate->simd_initialized))
> > + {
> > + cstate->simd_initialized = true;
> > + cstate->simd_enabled = true;
> > + }
>
> Why do we do this initialization in CopyReadLine() as opposed to setting
> simd_enabled to true when initializing cstate in BeginCopyFrom()? If we
> can initialize it in BeginCopyFrom, we could probably remove
> simd_initialized.
Correct, I guess this is left over from the earlier versions.
> > + if (cstate->simd_enabled)
> > + result = CopyReadLineText(cstate, is_csv, true);
> > + else
> > + result = CopyReadLineText(cstate, is_csv, false);
>
> I know we discussed this upthread, but I'd like to take a closer look at
> this to see whether/why it makes such a big difference. It's a bit awkward
> that CopyReadLineText() needs to manage both its local simd_enabled and
> cstate->simd_enabled.
I extensively benchmarked this with the new v6 version. If I change
this to either of:
    CopyReadLineText(cstate, is_csv);
or
    CopyReadLineText(cstate, is_csv, cstate->simd_enabled);
then there is a 5-10% regression for the scalar path. I ran my
benchmarks with both "meson --buildtype=debugoptimized" and "meson
--buildtype=release", but the result is the same.
Also, if I change this code to:
    if (cstate->simd_enabled)
    {
        if (is_csv)
            result = CopyReadLineText(cstate, true, true);
        else
            result = CopyReadLineText(cstate, false, true);
    }
    else
    {
        if (is_csv)
            result = CopyReadLineText(cstate, true, false);
        else
            result = CopyReadLineText(cstate, false, false);
    }
then I see a ~5% performance improvement in the scalar path compared to master.
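To make the inlining argument concrete, here is a minimal standalone sketch of the pattern. It is not PostgreSQL code: scan_line(), scan_line_dispatch(), is_special(), ALWAYS_INLINE and BLOCK are hypothetical names, and the "fast path" is an ordinary loop standing in for the SIMD block. The point is only that each call site passes literal booleans, so the compiler can emit a separate specialized copy of the always-inline body for each combination and drop the branches on those flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for pg_attribute_always_inline. */
#if defined(__GNUC__) || defined(__clang__)
#define ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define ALWAYS_INLINE inline
#endif

#define BLOCK 8					/* bytes skipped per fast-path step */

/* Hypothetical helper: is this byte special for the given format? */
static inline bool
is_special(char c, bool is_csv)
{
	if (c == '\n' || c == '\r')
		return true;
	return is_csv ? (c == '"') : (c == '\\');
}

/*
 * Return the offset of the first special byte, or len if there is none.
 * When is_csv and fast_path are compile-time constants at the call site,
 * the branches on them can be folded away in the inlined copy.
 */
static ALWAYS_INLINE size_t
scan_line(const char *buf, size_t len, bool is_csv, bool fast_path)
{
	size_t		i = 0;

	assert(buf != NULL);

	if (fast_path)
	{
		/* Skip whole blocks that contain no special byte. */
		while (len - i >= BLOCK)
		{
			bool		found = false;

			for (size_t j = 0; j < BLOCK; j++)
				found |= is_special(buf[i + j], is_csv);
			if (found)
				break;
			i += BLOCK;
		}
	}

	/* Byte-by-byte path handles the remainder. */
	for (; i < len; i++)
		if (is_special(buf[i], is_csv))
			return i;
	return len;
}

/* Every call passes literal booleans, giving four specialized copies. */
size_t
scan_line_dispatch(const char *buf, size_t len, bool is_csv, bool fast_path)
{
	if (fast_path)
	{
		if (is_csv)
			return scan_line(buf, len, true, true);
		else
			return scan_line(buf, len, false, true);
	}
	else
	{
		if (is_csv)
			return scan_line(buf, len, true, false);
		else
			return scan_line(buf, len, false, false);
	}
}
```

Whether the specialization pays off depends on the compiler and the surrounding code, so this only illustrates the mechanism, not the measured numbers above.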
> + /* Load a chunk of data into a vector register */
> + vector8_load(&chunk, (const uint8 *) &copy_input_buf[input_buf_ptr]);
>
> As mentioned upthread [0], I think it's worth testing whether processing
> multiple vectors worth of data in each loop iteration is worthwhile.
>
> [0] https://postgr.es/m/aSTVOe6BIe5f1l3i%40nathan
Unlike pg_lfind32(), CopyReadLineText() has to search for multiple keys.
I am not sure whether I used multiple vectors correctly, but I attached
what I did as 0002; could you please look at it? I didn't see any
performance benefit in my benchmarks, though.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachments:
[text/x-patch] v7-0001-Speed-up-COPY-FROM-text-CSV-parsing-using-SIMD.patch (7.8K, 2-v7-0001-Speed-up-COPY-FROM-text-CSV-parsing-using-SIMD.patch)
From c4b29849ad9f87f51022b947a9a0ab695dd1cde2 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz <[email protected]>
Date: Fri, 13 Feb 2026 13:28:55 +0300
Subject: [PATCH v7 1/2] Speed up COPY FROM text/CSV parsing using SIMD
This patch disables SIMD when SIMD encounters a special character which
is neither EOF nor EOL.
Author: Shinya Kato <[email protected]>
Author: Nazir Bilal Yavuz <[email protected]>
Reviewed-by: Kazar Ayoub <[email protected]>
Reviewed-by: Nathan Bossart <[email protected]>
Reviewed-by: Neil Conway <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Reviewed-by: Manni Wood <[email protected]>
Reviewed-by: Mark Wong <[email protected]>
Discussion: https://postgr.es/m/CAOzEurSW8cNr6TPKsjrstnPfhf4QyQqB4tnPXGGe8N4e_v7Jig%40mail.gmail.com
---
src/backend/commands/copyfrom.c | 3 +
src/backend/commands/copyfromparse.c | 125 ++++++++++++++++++++++-
src/include/commands/copyfrom_internal.h | 3 +
3 files changed, 126 insertions(+), 5 deletions(-)
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 25ee20b23db..40dae0bdacc 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1721,6 +1721,9 @@ BeginCopyFrom(ParseState *pstate,
cstate->cur_attval = NULL;
cstate->relname_only = false;
+ /* Initialize SIMD */
+ cstate->simd_enabled = true;
+
/*
* Allocate buffers for the input pipeline.
*
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 94d6f415a06..4a127d1af90 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -72,6 +72,7 @@
#include "miscadmin.h"
#include "pgstat.h"
#include "port/pg_bswap.h"
+#include "port/simd.h"
#include "utils/builtins.h"
#include "utils/rel.h"
@@ -141,12 +142,14 @@ static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0";
/* non-export function prototypes */
static bool CopyReadLine(CopyFromState cstate, bool is_csv);
-static bool CopyReadLineText(CopyFromState cstate, bool is_csv);
static int CopyReadAttributesText(CopyFromState cstate);
static int CopyReadAttributesCSV(CopyFromState cstate);
static Datum CopyReadBinaryAttribute(CopyFromState cstate, FmgrInfo *flinfo,
Oid typioparam, int32 typmod,
bool *isnull);
+static pg_attribute_always_inline bool CopyReadLineText(CopyFromState cstate,
+ bool is_csv,
+ bool simd_enabled);
static pg_attribute_always_inline bool CopyFromTextLikeOneRow(CopyFromState cstate,
ExprContext *econtext,
Datum *values,
@@ -1173,8 +1176,14 @@ CopyReadLine(CopyFromState cstate, bool is_csv)
resetStringInfo(&cstate->line_buf);
cstate->line_buf_valid = false;
- /* Parse data and transfer into line_buf */
- result = CopyReadLineText(cstate, is_csv);
+ /*
+ * Parse data and transfer into line_buf. To benefit from inlining, call
+ * CopyReadLineText() with constant boolean arguments.
+ */
+ if (cstate->simd_enabled)
+ result = CopyReadLineText(cstate, is_csv, true);
+ else
+ result = CopyReadLineText(cstate, is_csv, false);
if (result)
{
@@ -1241,8 +1250,8 @@ CopyReadLine(CopyFromState cstate, bool is_csv)
/*
* CopyReadLineText - inner loop of CopyReadLine for text mode
*/
-static bool
-CopyReadLineText(CopyFromState cstate, bool is_csv)
+static pg_attribute_always_inline bool
+CopyReadLineText(CopyFromState cstate, bool is_csv, bool simd_enabled)
{
char *copy_input_buf;
int input_buf_ptr;
@@ -1257,6 +1266,14 @@ CopyReadLineText(CopyFromState cstate, bool is_csv)
char quotec = '\0';
char escapec = '\0';
+#ifndef USE_NO_SIMD
+ Vector8 nl = vector8_broadcast('\n');
+ Vector8 cr = vector8_broadcast('\r');
+ Vector8 bs = vector8_broadcast('\\');
+ Vector8 quote = vector8_broadcast(0);
+ Vector8 escape = vector8_broadcast(0);
+#endif
+
if (is_csv)
{
quotec = cstate->opts.quote[0];
@@ -1264,6 +1281,12 @@ CopyReadLineText(CopyFromState cstate, bool is_csv)
/* ignore special escape processing if it's the same as quotec */
if (quotec == escapec)
escapec = '\0';
+
+#ifndef USE_NO_SIMD
+ quote = vector8_broadcast(quotec);
+ if (quotec != escapec)
+ escape = vector8_broadcast(escapec);
+#endif
}
/*
@@ -1330,6 +1353,98 @@ CopyReadLineText(CopyFromState cstate, bool is_csv)
need_data = false;
}
+#ifndef USE_NO_SIMD
+
+ /*
+ * Use SIMD instructions to efficiently scan the input buffer for
+ * special characters (e.g., newline, carriage return, quote, and
+ * escape). This is faster than byte-by-byte iteration, especially on
+ * large buffers.
+ *
+ * We do not apply the SIMD fast path in either of the following
+ * cases: - When the previously processed character was an escape
+ * character (last_was_esc), since the next byte must be examined
+ * sequentially. - When the remaining buffer is smaller than one
+ * vector width (sizeof(Vector8)), since SIMD operates on fixed-size
+ * chunks.
+ *
+ * Note that, SIMD may become slower when the input contains many
+ * special characters. To avoid this regression, we disable SIMD for
+ * the rest of the input once we encounter a special character which
+ * is neither EOF nor EOL.
+ */
+ if (simd_enabled && !last_was_esc && copy_buf_len - input_buf_ptr > sizeof(Vector8))
+ {
+ Vector8 chunk;
+ Vector8 match = vector8_broadcast(0);
+ uint32 mask;
+
+ /* Load a chunk of data into a vector register */
+ vector8_load(&chunk, (const uint8 *) &copy_input_buf[input_buf_ptr]);
+
+ if (is_csv)
+ {
+ /* \n and \r are not special inside quotes */
+ if (!in_quote)
+ match = vector8_or(vector8_eq(chunk, nl), vector8_eq(chunk, cr));
+
+ match = vector8_or(match, vector8_eq(chunk, quote));
+ if (escapec != '\0')
+ match = vector8_or(match, vector8_eq(chunk, escape));
+ }
+ else
+ {
+ match = vector8_or(vector8_eq(chunk, nl), vector8_eq(chunk, cr));
+ match = vector8_or(match, vector8_eq(chunk, bs));
+ }
+
+ /* Check if we found any special characters */
+ mask = vector8_highbit_mask(match);
+ if (mask != 0)
+ {
+ /*
+ * Found a special character. Advance up to that point and let
+ * the scalar code handle it.
+ */
+ int advance = pg_rightmost_one_pos32(mask);
+ char c1,
+ c2;
+ bool simd_hit_eol,
+ simd_hit_eof;
+
+ input_buf_ptr += advance;
+ c1 = copy_input_buf[input_buf_ptr];
+
+ /*
+ * Since we stopped within the chunk and ((copy_buf_len -
+ * input_buf_ptr) > sizeof(Vector8)) is true,
+ * copy_input_buf[input_buf_ptr + 1] is guaranteed to be
+ * readable.
+ */
+ c2 = copy_input_buf[input_buf_ptr + 1];
+ simd_hit_eol = (c1 == '\r' || c1 == '\n') && (!is_csv || !in_quote);
+ simd_hit_eof = c1 == '\\' && c2 == '.' && !is_csv;
+
+ /*
+ * Do not disable SIMD when we hit EOL or EOF characters. In
+ * practice, it does not matter for EOF because parsing ends
+ * there, but we keep the behavior consistent.
+ */
+ if (!(simd_hit_eof || simd_hit_eol))
+ {
+ simd_enabled = false;
+ cstate->simd_enabled = false;
+ }
+ }
+ else
+ {
+ /* No special characters found, so skip the entire chunk */
+ input_buf_ptr += sizeof(Vector8);
+ continue;
+ }
+ }
+#endif
+
/* OK to fetch a character */
prev_raw_ptr = input_buf_ptr;
c = copy_input_buf[input_buf_ptr++];
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 822ef33cf69..73ce777c52b 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -89,6 +89,9 @@ typedef struct CopyFromStateData
const char *cur_attval; /* current att value for error messages */
bool relname_only; /* don't output line number, att, etc. */
+ /* SIMD variables */
+ bool simd_enabled;
+
/*
* Working state
*/
--
2.47.3
[text/x-patch] v7-0002-Use-4-vectors-in-CopyReadLineText-SIMD.patch (6.4K, 3-v7-0002-Use-4-vectors-in-CopyReadLineText-SIMD.patch)
From 2de9b5bc18bfa169b3ba3507b6bdf79d277c0ad4 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz <[email protected]>
Date: Fri, 13 Feb 2026 13:36:34 +0300
Subject: [PATCH v7 2/2] Use 4 vectors in CopyReadLineText() SIMD
---
src/backend/commands/copyfromparse.c | 116 +++++++++++++++++++++------
1 file changed, 92 insertions(+), 24 deletions(-)
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 4a127d1af90..caadc40cc8b 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -1361,6 +1361,9 @@ CopyReadLineText(CopyFromState cstate, bool is_csv, bool simd_enabled)
* escape). This is faster than byte-by-byte iteration, especially on
* large buffers.
*
+ * For better instruction-level parallelism, we try to process four
+ * vectors at a time.
+ *
* We do not apply the SIMD fast path in either of the following
* cases: - When the previously processed character was an escape
* character (last_was_esc), since the next byte must be examined
@@ -1373,53 +1376,118 @@ CopyReadLineText(CopyFromState cstate, bool is_csv, bool simd_enabled)
* the rest of the input once we encounter a special character which
* is neither EOF nor EOL.
*/
- if (simd_enabled && !last_was_esc && copy_buf_len - input_buf_ptr > sizeof(Vector8))
+ if (simd_enabled && !last_was_esc && copy_buf_len - input_buf_ptr >= 4 * sizeof(Vector8))
{
- Vector8 chunk;
- Vector8 match = vector8_broadcast(0);
- uint32 mask;
-
- /* Load a chunk of data into a vector register */
- vector8_load(&chunk, (const uint8 *) &copy_input_buf[input_buf_ptr]);
+ Vector8 chunk1,
+ chunk2,
+ chunk3,
+ chunk4;
+ Vector8 match1,
+ match2,
+ match3,
+ match4;
+ Vector8 tmp1,
+ tmp2,
+ result;
+
+ /* Load four chunks of data into vector registers */
+ vector8_load(&chunk1, (const uint8 *) &copy_input_buf[input_buf_ptr]);
+ vector8_load(&chunk2, (const uint8 *) &copy_input_buf[input_buf_ptr + sizeof(Vector8)]);
+ vector8_load(&chunk3, (const uint8 *) &copy_input_buf[input_buf_ptr + 2 * sizeof(Vector8)]);
+ vector8_load(&chunk4, (const uint8 *) &copy_input_buf[input_buf_ptr + 3 * sizeof(Vector8)]);
if (is_csv)
{
/* \n and \r are not special inside quotes */
if (!in_quote)
- match = vector8_or(vector8_eq(chunk, nl), vector8_eq(chunk, cr));
+ {
+ match1 = vector8_or(vector8_eq(chunk1, nl), vector8_eq(chunk1, cr));
+ match2 = vector8_or(vector8_eq(chunk2, nl), vector8_eq(chunk2, cr));
+ match3 = vector8_or(vector8_eq(chunk3, nl), vector8_eq(chunk3, cr));
+ match4 = vector8_or(vector8_eq(chunk4, nl), vector8_eq(chunk4, cr));
+ }
+ else
+ {
+ match1 = vector8_broadcast(0);
+ match2 = vector8_broadcast(0);
+ match3 = vector8_broadcast(0);
+ match4 = vector8_broadcast(0);
+ }
- match = vector8_or(match, vector8_eq(chunk, quote));
+ match1 = vector8_or(match1, vector8_eq(chunk1, quote));
+ match2 = vector8_or(match2, vector8_eq(chunk2, quote));
+ match3 = vector8_or(match3, vector8_eq(chunk3, quote));
+ match4 = vector8_or(match4, vector8_eq(chunk4, quote));
if (escapec != '\0')
- match = vector8_or(match, vector8_eq(chunk, escape));
+ {
+ match1 = vector8_or(match1, vector8_eq(chunk1, escape));
+ match2 = vector8_or(match2, vector8_eq(chunk2, escape));
+ match3 = vector8_or(match3, vector8_eq(chunk3, escape));
+ match4 = vector8_or(match4, vector8_eq(chunk4, escape));
+ }
}
else
{
- match = vector8_or(vector8_eq(chunk, nl), vector8_eq(chunk, cr));
- match = vector8_or(match, vector8_eq(chunk, bs));
+ match1 = vector8_or(vector8_eq(chunk1, nl), vector8_eq(chunk1, cr));
+ match2 = vector8_or(vector8_eq(chunk2, nl), vector8_eq(chunk2, cr));
+ match3 = vector8_or(vector8_eq(chunk3, nl), vector8_eq(chunk3, cr));
+ match4 = vector8_or(vector8_eq(chunk4, nl), vector8_eq(chunk4, cr));
+
+ match1 = vector8_or(match1, vector8_eq(chunk1, bs));
+ match2 = vector8_or(match2, vector8_eq(chunk2, bs));
+ match3 = vector8_or(match3, vector8_eq(chunk3, bs));
+ match4 = vector8_or(match4, vector8_eq(chunk4, bs));
}
- /* Check if we found any special characters */
- mask = vector8_highbit_mask(match);
- if (mask != 0)
+ /* Combine results to check if any chunk has special characters */
+ tmp1 = vector8_or(match1, match2);
+ tmp2 = vector8_or(match3, match4);
+ result = vector8_or(tmp1, tmp2);
+
+ if (vector8_is_highbit_set(result))
{
/*
- * Found a special character. Advance up to that point and let
- * the scalar code handle it.
+ * Found a special character somewhere in the four chunks.
+ * Identify the first chunk containing it.
*/
- int advance = pg_rightmost_one_pos32(mask);
+ uint32 mask;
+ int advance;
char c1,
c2;
bool simd_hit_eol,
simd_hit_eof;
+ mask = vector8_highbit_mask(match1);
+ if (mask == 0)
+ {
+ input_buf_ptr += sizeof(Vector8);
+ mask = vector8_highbit_mask(match2);
+ }
+ if (mask == 0)
+ {
+ input_buf_ptr += sizeof(Vector8);
+ mask = vector8_highbit_mask(match3);
+ }
+ if (mask == 0)
+ {
+ input_buf_ptr += sizeof(Vector8);
+ mask = vector8_highbit_mask(match4);
+ }
+ Assert(mask != 0);
+
+ /*
+ * Found a special character. Advance up to that point and let
+ * the scalar code handle it.
+ */
+ advance = pg_rightmost_one_pos32(mask);
input_buf_ptr += advance;
c1 = copy_input_buf[input_buf_ptr];
/*
- * Since we stopped within the chunk and ((copy_buf_len -
- * input_buf_ptr) > sizeof(Vector8)) is true,
- * copy_input_buf[input_buf_ptr + 1] is guaranteed to be
- * readable.
+ * Since we stopped within the block and ((copy_buf_len -
+ * input_buf_ptr) >= 4 * sizeof(Vector8)) was true at the
+ * start, copy_input_buf[input_buf_ptr + 1] is guaranteed to
+ * be readable.
*/
c2 = copy_input_buf[input_buf_ptr + 1];
simd_hit_eol = (c1 == '\r' || c1 == '\n') && (!is_csv || !in_quote);
@@ -1438,8 +1506,8 @@ CopyReadLineText(CopyFromState cstate, bool is_csv, bool simd_enabled)
}
else
{
- /* No special characters found, so skip the entire chunk */
- input_buf_ptr += sizeof(Vector8);
+ /* No special characters found, so skip the entire block */
+ input_buf_ptr += 4 * sizeof(Vector8);
continue;
}
}
--
2.47.3