Date: Tue, 31 Mar 2026 11:30:54 -0500
From: Nathan Bossart
To: KAZAR Ayoub
Cc: Andres Freund, Pg Hackers, Neil Conway, Manni Wood, Andrew Dunstan, Shinya Kato, Mark Wong, Nazir Bilal Yavuz
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD

On Fri, Mar 27, 2026 at 07:48:38PM +0100, KAZAR Ayoub wrote:
> I added a prescan loop inside the simd helpers trying to catch special
> chars in sizeof(Vector8) characters, i measured how good is this at
> reducing the overhead of starting simd and exiting at first vector:
> the scalar loop is better than SIMD for one vector if it finds a special
> character before 6th character, worst case is not a clean vector, where the
> scalar loop needs 20 more cycles compared to SIMD.
> This helps mitigate the case of JSON(B) in CSV format, this is why I only
> added this for CSV case only.

Interesting.

> In a benchmark with 10M early SIMD exit like the JSONB case, the previous
> 3% regression is gone.

While these are nice results, I think it's best that we target v20 for this
patch so that we have more time to benchmark and explore edge cases.

--
nathan
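
For readers following the thread, the prescan idea quoted above can be sketched in plain C. This is an illustration only, not code from the patch: `find_first_special`, `is_special`, `chunk_has_byte`, and `CHUNK_SIZE` are hypothetical names, the set of special characters is an assumed CSV set (quote, comma, newline, carriage return), and an 8-byte word comparison stands in for the real `Vector8` operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for sizeof(Vector8); this sketch uses an 8-byte word. */
#define CHUNK_SIZE sizeof(uint64_t)

/* Assumed set of special characters for the CSV case (hypothetical). */
static int
is_special(unsigned char c)
{
    return c == '"' || c == ',' || c == '\n' || c == '\r';
}

/* Nonzero if any byte of v equals c (classic zero-byte bit trick). */
static int
chunk_has_byte(uint64_t v, unsigned char c)
{
    uint64_t ones = UINT64_C(0x0101010101010101);
    uint64_t highs = UINT64_C(0x8080808080808080);
    uint64_t x = v ^ (ones * c);    /* bytes equal to c become zero */

    return ((x - ones) & ~x & highs) != 0;
}

/* Index of the first special character in buf[0..len), or len if none. */
static size_t
find_first_special(const char *buf, size_t len)
{
    size_t i = 0;
    size_t prescan_end = len < CHUNK_SIZE ? len : CHUNK_SIZE;

    assert(buf != NULL || len == 0);

    /* Prescan: scalar check of the first chunk, so an early special
     * character is found without paying the chunked-loop setup cost. */
    for (; i < prescan_end; i++)
    {
        if (is_special((unsigned char) buf[i]))
            return i;
    }

    /* Chunked loop: skip whole chunks that contain no special character. */
    for (; i + CHUNK_SIZE <= len; i += CHUNK_SIZE)
    {
        uint64_t v;

        memcpy(&v, buf + i, CHUNK_SIZE);
        if (chunk_has_byte(v, '"') || chunk_has_byte(v, ',') ||
            chunk_has_byte(v, '\n') || chunk_has_byte(v, '\r'))
            break;
    }

    /* Scalar pass: locate the hit inside a flagged chunk, or scan the tail. */
    for (; i < len; i++)
    {
        if (is_special((unsigned char) buf[i]))
            return i;
    }
    return len;
}
```

With JSON-like input such as `{"a":1}`, the quote at offset 1 is caught in the prescan and the chunked loop is never entered, which is the early-exit case the quoted benchmark describes. Whether the prescan pays off in practice depends on the measurements discussed in the thread, not on this sketch.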