Date: Fri, 13 Mar 2026 09:05:13 -0500
From: Nathan Bossart
To: Nazir Bilal Yavuz
Cc: Manni Wood, KAZAR Ayoub, Neil Conway, Andrew Dunstan, Shinya Kato, PostgreSQL-development
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD

On Fri, Mar 13, 2026 at 04:34:49PM +0300, Nazir Bilal Yavuz wrote:
> On Fri, 13 Mar 2026 at 14:57, Nazir Bilal Yavuz wrote:
>> Unfortunately, v15 causes a regression for a 'csv & wide & 1/3' case
>> on my end. v14 was taking 8000ms but v15 took ~9100ms. If we add the
>> tmp_hit_eof variable then the regression disappears. Also, if I use a
>> struct like below, regression disappears again.
>
>> When I removed the tmp_hit_eof variable on v14, I didn't encounter any
>> regression. I really don't understand why this is happening on my end.
>> Manni didn't encounter any regression on the benchmark [1].
>
> Problem might be related to gcc. I am using Debian Trixie and my
> current gcc version is 'gcc version 14.2.0 (Debian 14.2.0-19)'. If I
> compile Postgres with 'Debian clang version 19.1.7 (3+b1)', then there
> is no regression, which makes more sense IMO.

Let's just re-add the temporary variable for hit_eof. The struct idea is
clever, but it's just a little more complicated than I think is necessary
here. I've also removed the goto in favor of just duplicating the "out"
code, like you had before. I'd like to avoid sporadic #ifndef USE_NO_SIMD
uses, and goto is out of fashion, anyway.
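For anyone following along, here is a minimal sketch of the two patterns
mentioned above: a local temporary standing in for hit_eof inside the scan
loop, and exit code written out at each return instead of a shared "goto
out". This is not the code from the attached patch. ScanState, scan_line,
and input_done are made-up names for illustration only, and the sketch
makes no claim about why gcc and clang behave differently here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a parser state struct; not the real COPY state. */
typedef struct ScanState
{
	const char *buf;		/* input bytes available so far */
	size_t		len;		/* number of valid bytes in buf */
	size_t		pos;		/* next byte to examine */
	bool		hit_eof;	/* set once the input is exhausted */
	bool		input_done;	/* assumption: caller has no more data to load */
} ScanState;

/*
 * Scan forward to the end of the current line.  Returns true if a newline
 * was found (pos is left just past it), false if the buffer ran out first.
 */
static bool
scan_line(ScanState *st)
{
	/* Local copies, so the loop body does not update the struct directly. */
	size_t		pos = st->pos;
	bool		tmp_hit_eof = st->hit_eof;

	for (;;)
	{
		if (pos >= st->len)
		{
			if (st->input_done)
				tmp_hit_eof = true;

			/* Exit code written out here rather than via "goto out". */
			st->pos = pos;
			st->hit_eof = tmp_hit_eof;
			return false;
		}

		if (st->buf[pos++] == '\n')
		{
			/* Same write-back, duplicated for this exit. */
			st->pos = pos;
			st->hit_eof = tmp_hit_eof;
			return true;
		}
	}
}
```

The cost of dropping the goto is that the two write-back lines appear
twice; in exchange the function has no labels and no conditionally
compiled jump targets.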
--
nathan

Attachment: v17-0001-Optimize-COPY-FROM-FORMAT-text-csv-using-SIMD.patch