MIME-Version: 1.0
References: <8615c983-1662-43b4-b0c9-49d194ac33aa@dunslane.net>
From: KAZAR Ayoub
Date: Tue, 21 Oct 2025 08:44:06 +0200
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: Nazir Bilal Yavuz, nathandbossart@gmail.com, ants.aasma@cybertec.at
Cc: Andrew Dunstan, Shinya Kato, Pg Hackers
Content-Type: multipart/alternative; boundary="00000000000074b3a50641a587c4"

--00000000000074b3a50641a587c4
Content-Type: text/plain; charset="UTF-8"

On Tue, Oct 21, 2025, 8:17 AM KAZAR Ayoub wrote:

> Currently we are at 200-400Mbps which isn't that terrible compared to
> production and non production grade parsers (of course we don't only parse
> in our case), also we are using SSE2 only so theoretically if we add
> support for avx later on we'll have even better numbers.
> Maybe more micro optimizations to the current heuristic can squeeze it
> more.
>
>
> [1] https://branchfree.org/2019/03/06/code-fragment-finding-quote-pairs-with-carry-less-multiply-pclmulqdq/
> [2] https://github.com/AyoubKaz07/postgres/commit/73c6ecfedae4cce5c3f375fd6074b1ca9dfe1daf
> [3] https://agner.org/optimize/instruction_tables.pdf
> [4] https://www.uops.info/table.html
>
> Regards,
> Ayoub Kazar.
>

Sorry, I meant 200-400 MB/s, not Mbps.

Regards,
Ayoub Kazar.

--00000000000074b3a50641a587c4--