Date: Tue, 10 Mar 2026 14:16:57 -0500
From: Nathan Bossart
To: KAZAR Ayoub
Cc: Andres Freund, Pg Hackers, Neil Conway, Manni Wood, Andrew Dunstan,
    Shinya Kato, Mark Wong, Nazir Bilal Yavuz
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD

On Sat, Feb 14, 2026 at 04:02:21PM +0100, KAZAR Ayoub wrote:
> On Thu, Feb 12, 2026 at 10:25 PM Andres Freund wrote:
>> I have a hard time believing that adding a strlen() to the handling of a
>> short column won't be a measurable overhead with lots of short attributes.
>> Particularly because the patch afaict will call it repeatedly if there are
>> any to-be-escaped characters.
>
> [...]
>
> 1000 columns:
> TEXT: 17% regression
> CSV: 3.4% regression
>
> 500 columns:
> TEXT: 17.7% regression
> CSV: 3.1% regression
>
> 100 columns:
> TEXT: 17.3% regression
> CSV: 3% regression
>
> A bit unstable results, but yeah the overhead for worse cases like this is
> really significant, I can't argue whether this is worth it or not, so
> thoughts on this ?

I seriously doubt we'd commit something that produces a 17% regression
here. Perhaps we should skip the SIMD paths whenever transcoding is
required.

-- 
nathan
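
For illustration only, the idea of skipping the wide path when transcoding
is required could be sketched roughly as below. This is not code from the
patch. The struct SketchCopyState, its fields, and the functions
scan_scalar, scan_chunked, and first_special are all hypothetical names,
the set of escaped bytes is a simplification, and the 16-byte chunk loop
is only a stand-in for real SIMD instructions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-COPY state; NOT PostgreSQL's actual struct. */
typedef struct SketchCopyState
{
    bool need_transcoding;  /* assumed: true when encodings differ */
    char delim;             /* column delimiter */
} SketchCopyState;

/* Byte-at-a-time scan: index of the first byte needing an escape, or len. */
static size_t
scan_scalar(const char *s, size_t len, char delim)
{
    size_t i;

    for (i = 0; i < len; i++)
    {
        unsigned char c = (unsigned char) s[i];

        if (c < 0x20 || c == '\\' || c == (unsigned char) delim)
            break;
    }
    return i;
}

/*
 * Stand-in for a vectorized scan: skip whole 16-byte chunks that contain
 * no special byte, then let the scalar loop find the exact position.
 */
static size_t
scan_chunked(const char *s, size_t len, char delim)
{
    size_t i = 0;

    while (len - i >= 16)
    {
        bool hit = false;

        for (size_t j = 0; j < 16; j++)
        {
            unsigned char c = (unsigned char) s[i + j];

            hit |= (c < 0x20) | (c == '\\') | (c == (unsigned char) delim);
        }
        if (hit)
            break;
        i += 16;
    }
    return i + scan_scalar(s + i, len - i, delim);
}

/* Gate: use the wide path only when no transcoding is required. */
static size_t
first_special(const SketchCopyState *st, const char *s, size_t len)
{
    if (st->need_transcoding)
        return scan_scalar(s, len, st->delim);
    return scan_chunked(s, len, st->delim);
}
```

Both paths return the same index for the same input; the gate only decides
which loop does the work, so the transcoding case keeps the existing
byte-at-a-time cost profile.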