From: Nazir Bilal Yavuz
Date: Sat, 18 Oct 2025 23:01:41 +0300
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: KAZAR Ayoub
Cc: Andrew Dunstan, Shinya Kato, pgsql-hackers@postgresql.org

Hi,

On Sat, 18 Oct 2025 at 21:46, KAZAR Ayoub wrote:
>
> Hello,
>
> I’ve rebenchmarked the new heuristic patch. We still have the previous improvements ranging from 15% to 30%. For regressions, I see at most 3% or 4% in the worst case, so this is solid.

Thank you so much for doing this! The results look nice. Do you think there are any other benchmarks that might be interesting to try?

> I'm also trying the idea of doing SIMD inside quotes with prefix XOR using carry-less multiplication, avoiding the slow path in all cases even with weird-looking input, but it needs to take into consideration the availability of the PCLMULQDQ instruction set with and here we go, it quickly starts to become dirty OR we can wait for the decision to start requiring x86-64-v2 or v3 which has SSE4.2 and AVX2.

I cannot quite picture this. Would you mind sharing a few examples or patches?

-- 
Regards,
Nazir Bilal Yavuz
Microsoft
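
P.S. To make the quoted prefix-XOR idea easier to picture, here is a minimal sketch in portable C. It is not taken from any patch in this thread. The names prefix_xor, match_mask and unquoted_delims are hypothetical, the scalar loop in match_mask stands in for a SIMD compare plus movemask, and escape characters are ignored. The shift-and-XOR cascade produces the same 64-bit result as a carry-less multiplication by an all-ones operand, which is the step PCLMULQDQ would do in one instruction.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit i of the result is the XOR of bits 0..i of x.  Applied to a bitmask
 * of quote positions, this sets every bit from an opening quote up to (but
 * not including) the matching closing quote.
 */
static uint64_t
prefix_xor(uint64_t x)
{
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    return x;
}

/*
 * Bit i is set where buf[i] == c, for up to 64 bytes.  Scalar stand-in
 * (an assumption of this sketch) for a vector compare + movemask.
 */
static uint64_t
match_mask(const char *buf, size_t len, char c)
{
    uint64_t    mask = 0;

    for (size_t i = 0; i < len && i < 64; i++)
    {
        if (buf[i] == c)
            mask |= UINT64_C(1) << i;
    }
    return mask;
}

/*
 * Delimiter positions that are NOT inside a quoted field, assuming the
 * chunk starts outside quotes.
 */
static uint64_t
unquoted_delims(const char *buf, size_t len, char delim, char quote)
{
    uint64_t    in_quotes = prefix_xor(match_mask(buf, len, quote));

    return match_mask(buf, len, delim) & ~in_quotes;
}
```

For the 9-byte input a,"b,c",d the quotes sit at positions 2 and 6, so the in-quote mask covers positions 2 to 5 and only the commas at positions 1 and 7 survive. A real implementation would also have to carry the in-quote state from one 64-byte chunk to the next.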