Date: Mon, 2 Mar 2026 13:55:05 -0600
From: Nathan Bossart
To: Nazir Bilal Yavuz
Cc: Manni Wood, KAZAR Ayoub, Neil Conway, Andrew Dunstan, Shinya Kato, PostgreSQL-development
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD

On Wed, Feb 25, 2026 at 05:24:27PM +0300, Nazir Bilal Yavuz wrote:
> If anyone has any suggestions/ideas, please let me know!

A couple of random ideas:

* Additional inlining for callers.  I looked around a little bit and
  didn't see any great candidates, so I don't have much faith in this, but
  maybe you'll see something I don't.

* Disable SIMD if we are consistently getting small rows.  That won't help
  your "wide & CSV 1/3" case in all likelihood, but perhaps it'll help with
  the regression for narrow rows described elsewhere.

* Surround the variable initializations with "if (simd_enabled)".
  Presumably compilers are smart enough to remove those in the non-SIMD
  paths already, but it could be worth a try.

* Add simd_enabled function parameter to CopyReadLine(),
  NextCopyFromRawFieldsInternal(), and CopyFromTextLikeOneRow(), and do the
  bool literal trick in CopyFrom{Text,CSV}OneRow().  That could encourage
  the compiler to do some additional optimizations to reduce branching.

-- 
nathan
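
For readers unfamiliar with the "bool literal trick" mentioned in the last idea, here is a minimal standalone sketch.  It is not the COPY code: find_eol_internal(), find_eol_simd(), find_eol_scalar(), CHUNK, and the ALWAYS_INLINE macro are hypothetical names made up for illustration, and memchr() over a fixed-size chunk merely stands in for a vector compare.  The idea is that an always-inline function takes the flag as a parameter and each thin wrapper passes a literal, so the compiler can specialize the body per wrapper and drop the dead branch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: GCC/Clang attribute syntax; other compilers get plain inline. */
#if defined(__GNUC__) || defined(__clang__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline
#endif

#define CHUNK 16				/* hypothetical chunk width */

/*
 * Hypothetical shared scanner: returns the offset of the first '\n' in
 * buf, or len if there is none.  Because this is inlined into callers
 * that pass a literal for simd_enabled, the "if" below can be resolved
 * at compile time in each caller.
 */
ALWAYS_INLINE size_t
find_eol_internal(const char *buf, size_t len, bool simd_enabled)
{
	size_t		i = 0;

	if (simd_enabled)
	{
		/* Stand-in for a SIMD compare: skip whole chunks with no newline. */
		while (i + CHUNK <= len && memchr(buf + i, '\n', CHUNK) == NULL)
			i += CHUNK;
	}

	/* Scalar loop: handles the tail, or everything when SIMD is off. */
	while (i < len && buf[i] != '\n')
		i++;

	return i;
}

/* Thin wrappers that pass bool literals, one specialization each. */
size_t
find_eol_simd(const char *buf, size_t len)
{
	return find_eol_internal(buf, len, true);
}

size_t
find_eol_scalar(const char *buf, size_t len)
{
	return find_eol_internal(buf, len, false);
}
```

Both wrappers return the same offsets; only the generated code differs.  Whether this actually reduces branching in the COPY paths would need to be measured.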