From: KAZAR Ayoub
Date: Sat, 18 Oct 2025 20:46:29 +0200
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: Nazir Bilal Yavuz
Cc: Andrew Dunstan, Shinya Kato, pgsql-hackers@postgresql.org
References: <8615c983-1662-43b4-b0c9-49d194ac33aa@dunslane.net>
Hello,

I've rebenchmarked the new heuristic patch. We still have the previous improvements, ranging from 15% to 30%. For regressions, I see at most 3% or 4% in the worst case, so this is solid.

I'm also trying the idea of doing SIMD inside quotes with a prefix XOR computed by carry-less multiplication, which avoids the slow path in all cases, even with weird-looking input. But it needs to take into account the availability of the PCLMULQDQ instruction set (via <wmmintrin.h>), and there it quickly starts to become dirty. Alternatively, we can wait for the decision to start requiring x86-64-v2 or v3, which include SSE4.2 and AVX2 respectively.
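To make the prefix-XOR idea concrete, here is a minimal sketch, not the code from the patch. Given a 64-bit mask with one bit per quote character in a 64-byte chunk, the prefix XOR sets every bit from an opening quote up to (but not including) its closing quote. The names prefix_xor, quote_bits and in_quote_mask are hypothetical, the escape character is ignored, and the PCLMULQDQ path is only compiled when the compiler defines __PCLMUL__ (e.g. with -mpclmul); otherwise a portable shift-and-XOR fallback is used.

```c
#include <stdint.h>

#if defined(__PCLMUL__) && defined(__x86_64__)
#include <emmintrin.h>          /* _mm_set_epi64x, _mm_set1_epi8, _mm_cvtsi128_si64 */
#include <wmmintrin.h>          /* _mm_clmulepi64_si128 (PCLMULQDQ) */
#endif

/* Portable fallback: bit i of the result is the XOR of bits 0..i of x. */
static uint64_t
prefix_xor_portable(uint64_t x)
{
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    return x;
}

/* Prefix XOR, using carry-less multiplication when it is available. */
static uint64_t
prefix_xor(uint64_t x)
{
#if defined(__PCLMUL__) && defined(__x86_64__)
    /* Carry-less multiply by all-ones; the low 64 bits are the prefix XOR. */
    __m128i v = _mm_set_epi64x(0, (long long) x);
    __m128i ones = _mm_set1_epi8(-1);

    return (uint64_t) _mm_cvtsi128_si64(_mm_clmulepi64_si128(v, ones, 0));
#else
    return prefix_xor_portable(x);
#endif
}

/*
 * Hypothetical scalar helper: one bit per quote character in a chunk of up
 * to 64 bytes. A real implementation would build this mask with SIMD compares.
 */
static uint64_t
quote_bits(const char *buf, int len, char quote)
{
    uint64_t bits = 0;

    for (int i = 0; i < len && i < 64; i++)
        if (buf[i] == quote)
            bits |= UINT64_C(1) << i;
    return bits;
}

/*
 * In-quote mask for one chunk. *carry is all-ones if the previous chunk
 * ended inside a quoted field, zero otherwise, and is updated for the next
 * chunk. Escape characters are not handled in this sketch.
 */
static uint64_t
in_quote_mask(uint64_t quotes, uint64_t *carry)
{
    uint64_t mask = prefix_xor(quotes) ^ *carry;

    *carry = (uint64_t) 0 - (mask >> 63);
    return mask;
}
```

With such a mask, delimiters and newlines inside quoted fields can be discarded with a single AND-NOT on their own bitmasks instead of falling back to the byte-by-byte path. A doubled quote toggles the state twice, so it leaves the mask unchanged after the pair.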


Regards,
Ayoub Kazar
