From: Nazir Bilal Yavuz
Date: Fri, 13 Mar 2026 18:58:49 +0300
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: Nathan Bossart
Cc: Manni Wood, KAZAR Ayoub, Neil Conway, Andrew Dunstan, Shinya Kato, PostgreSQL-development

Hi,

On Fri, 13 Mar 2026 at 17:05, Nathan Bossart wrote:
>
> On Fri, Mar 13, 2026 at 04:34:49PM +0300, Nazir Bilal Yavuz wrote:
> > On Fri, 13 Mar 2026 at 14:57, Nazir Bilal Yavuz wrote:
> >> Unfortunately, v15 causes a regression for a 'csv & wide & 1/3' case
> >> on my end. v14 was taking 8000ms but v15 took ~9100ms. If we add the
> >> tmp_hit_eof variable then the regression disappears. Also, if I use a
> >> struct like below, regression disappears again.
> >
> >> When I removed the tmp_hit_eof variable on v14, I didn't encounter any
> >> regression. I really don't understand why this is happening on my end.
> >> Manni didn't encounter any regression on the benchmark [1].
> >
> > Problem might be related to gcc. I am using Debian Trixie and my
> > current gcc version is 'gcc version 14.2.0 (Debian 14.2.0-19)'. If I
> > compile Postgres with 'Debian clang version 19.1.7 (3+b1)', then there
> > is no regression, which makes more sense IMO.
>
> Let's just re-add the temporary variable for hit_eof.  The struct idea is
> clever, but it's just a little more complicated than I think is necessary
> here.
>
> I've also removed the goto in favor of just duplicating the "out" code,
> like you had before.  I'd like to avoid sporadic #ifndef USE_NO_SIMD uses,
> and goto is out of fashion, anyway.

Thanks! v17 LGTM. I didn't encounter any regressions.

--
Regards,
Nazir Bilal Yavuz
Microsoft