Date: Mon, 24 Nov 2025 15:59:21 -0600
From: Nathan Bossart
To: Nazir Bilal Yavuz
Cc: Andrew Dunstan, Shinya Kato, Manni Wood, KAZAR Ayoub, PostgreSQL-development
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
References: <8e226753-57af-489a-bfbe-caa23dd71286@dunslane.net>

On Thu, Nov 20, 2025 at 03:55:43PM +0300, Nazir Bilal Yavuz wrote:
> On Thu, 20 Nov 2025 at 00:01, Nathan Bossart wrote:
>> + /* Load a chunk of data into a vector register */
>> + vector8_load(&chunk, (const uint8 *) &copy_input_buf[input_buf_ptr]);
>>
>> In other places, processing 2 or 4 vectors of data at a time has proven
>> faster. Have you tried that here?
>
> Sorry, I could not find the related code piece. I only saw the
> vector8_load() inside of hex_decode_safe() function and its comment
> says:
>
> /*
>  * We must process 2 vectors at a time since the output will be half the
>  * length of the input.
>  */
>
> But this does not mention any speedup from using 2 vectors at a time.
> Could you please show the related code?

See pg_lfind32().

-- 
nathan
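
For readers following along, here is a minimal, portable sketch of the
"several vectors per iteration" idea discussed above. It is not code from
PostgreSQL or from the patch. It assumes a plain uint64_t as a stand-in for
a vector register, and the names has_zero_byte() and find_byte_unrolled()
are hypothetical. The pattern is the one I understand pg_lfind32() to use:
load several registers, OR their comparison results together, and branch
once per block rather than once per register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORDS_PER_BLOCK 4		/* "vectors" handled per loop iteration */

/* Nonzero iff at least one byte of v is zero (classic SWAR test). */
static inline uint64_t
has_zero_byte(uint64_t v)
{
	return (v - UINT64_C(0x0101010101010101)) & ~v &
		UINT64_C(0x8080808080808080);
}

/*
 * Hypothetical helper: return the offset of the first byte equal to c in
 * buf[0..len), or len if there is none.
 */
static size_t
find_byte_unrolled(const unsigned char *buf, size_t len, unsigned char c)
{
	/* Broadcast c into every byte lane of a 64-bit word. */
	const uint64_t pattern = UINT64_C(0x0101010101010101) * c;
	const size_t block = sizeof(uint64_t) * WORDS_PER_BLOCK;
	size_t		i = 0;

	assert(buf != NULL || len == 0);

	while (len - i >= block)
	{
		uint64_t	w[WORDS_PER_BLOCK];
		uint64_t	hit;

		/* memcpy avoids unaligned-access and aliasing problems. */
		memcpy(w, buf + i, block);

		/* XOR turns matching bytes into zero bytes; combine all four. */
		hit = has_zero_byte(w[0] ^ pattern) |
			has_zero_byte(w[1] ^ pattern) |
			has_zero_byte(w[2] ^ pattern) |
			has_zero_byte(w[3] ^ pattern);

		/* One branch per block: leave the fast path on any match. */
		if (hit)
			break;
		i += block;
	}

	/* Scalar scan locates the match inside the block, or covers the tail. */
	for (; i < len; i++)
	{
		if (buf[i] == c)
			return i;
	}
	return len;
}
```

Whether 2 or 4 registers per iteration actually helps COPY FROM would need
to be measured; the sketch only shows the shape of the loop.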