Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v4BpX-00A8wB-BN for pgsql-hackers@arkaria.postgresql.org; Thu, 02 Oct 2025 05:26:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v4BpU-006CrN-RO for pgsql-hackers@arkaria.postgresql.org; Thu, 02 Oct 2025 05:26:13 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v4BpU-006CpD-HP for pgsql-hackers@lists.postgresql.org; Thu, 02 Oct 2025 05:26:13 +0000 Received: from mail-qt1-x842.google.com ([2607:f8b0:4864:20::842]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1v4BpS-000zFe-1l for pgsql-hackers@lists.postgresql.org; Thu, 02 Oct 2025 05:26:12 +0000 Received: by mail-qt1-x842.google.com with SMTP id d75a77b69052e-4db385e046dso6214561cf.0 for ; Wed, 01 Oct 2025 22:26:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1759382770; x=1759987570; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=J8ZZR9PpOdum+AJvnT7HNaKHrIZKxp04+OXBFNlyTRg=; b=eOuWPd7zoNqrZ8bOIRZHPA8fRRy8tDrXvO7quAaonzNBkYTOHjUHKtOuY2xzC9sN/0 l8Sp3cEJFeI8bqiMLa5IJaEocP9C3poyHxGjWNhE1yghu6WZ504j/TbG/UPUUoR47g0e zeHiqq/b0ReNON06SnYa0d0NutV1t92DV9YSTasoMVWY3ixY0v5oQvfmw5G6Wrv0k85L noGdwAfpUO7Yy0ERDBz48Us0JcAPEnStpfgTvy6me7J2MsKepQqeVEjSMgRon21muNKh +eHMnpKQWzfjRBplnVS0cKtHWa23JFEEEZmqqWhu9BlXLWpYZ3ViJ6Az9arf9lWlxlQe 5K7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759382770; x=1759987570; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=J8ZZR9PpOdum+AJvnT7HNaKHrIZKxp04+OXBFNlyTRg=; b=sgH9DCV+kWmdSvtpwYokPJWVTh6FJ6XHrKnNE6BriYuZIQ1HSuCuQH6+PaPtwOuoLL z6KTQ7NajnP0ldS7HEonwLzrYX3nt6Qkh34ooGIwfaJHmS01FfMQm465LAyq20fJbtxv v4lqYK9BYnSoZKSl7qrbfkvNR4PrXpgHdJXnvSrRCBlln5OEnVGJf3Bi4+VWL9aqd/qB hqWwlIGPAM0jpQc1cc4JrHFW6j7mDlce2kkn/zQr0MtxN4i7yzxWYi4Plao0rXMWC2Nu QD+qwEultme6OodCn/fXq5UYBnTQ33DckOnWPbT3al4zD3lSa6cYnAUNYaGElkMaYvhU mwmw== X-Gm-Message-State: AOJu0YzbdMfmoRM7qmGeWlyd4u+uHFHh/fhIUPA2sXsto3tC/t5a71Kw reN6qoKKWYJJxphHKiqNrPztXOlw7gQzmyu8zuMC+6RYUkkRkb3NWK6xG0tNIOj2BKofSj1F//v B4zA++TuY1MAQwYNjfvZERP8xZ/tYPDQ= X-Gm-Gg: ASbGncsGreCn46giErRSgJaT+/Z62L9SxwNvGJxnQleWn01Cz5c3FuCJk4BVuc9Ci+y UL9X8IRnqWRMq6GLUqce5L1nK0eiU1TCE/n7NWJpFKwSVPIV47wZVsQe3Uxy4ZzN0ffRTOJzPH+ u0g9KP+64b1IiEQrCCT2023apFUt83qWfl8jfu90XTl0QdJUl7zE7oWTvDxC3T/NFGfrNOAlLAG 0gQYvqpm4fmhl6UKrWIDaz5o5wfCa0cgWdH24uMqVEntjvPdPo2Z8aEUX+7B50rJev6gSKpsyZy 7LXHpnrG9K9MtEghnElVEfcmCyOMd0+Jhs7E0Vf6IqsXL/3ktDU= X-Google-Smtp-Source: AGHT+IGMXJKXvPS2T8RZUbnTcGQkc/GNHaTyoE+aPAmIqwe8Ip8TNUiNkcD1xQA4e6SakogWCIfIiCwcwY4/arQYcyA= X-Received: by 2002:a05:622a:5a19:b0:4de:45ff:1de with SMTP id d75a77b69052e-4e41ca1ce07mr82100491cf.21.1759382770148; Wed, 01 Oct 2025 22:26:10 -0700 (PDT) MIME-Version: 1.0 References: <20250911054220.3784-1-root@ip-172-31-36-228.ec2.internal> In-Reply-To: From: John Naylor Date: Thu, 2 Oct 2025 12:25:59 +0700 X-Gm-Features: AS18NWBQW-Io-BK561D3yqI0TanWXe-_eois50hm-L6M149RP2VYiNXdcIET3Xo Message-ID: Subject: Re: Proposal for enabling auto-vectorization for checksum calculations To: Andrew Kim Cc: pgsql-hackers@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Sep 25, 2025 at 4:50=E2=80=AFAM Andrew Kim = wrote: > > Thanks, John. I see the issue now =E2=80=94 I=E2=80=99ll attach the entir= e patch > series in a single email so it shows up properly in the commitfest and > gets CI coverage. It's still picking up v4, and the archive link doesn't show any further replies. Something must have happened with the email threading, since you weren't on the thread at first. Please create an account and edit the entry to point to a more recent message ID: https://commitfest.postgresql.org/patch/5726/ > Please find attached v6 of the patchset, updated per your feedback. Thanks. (BTW, we discourage top-posting and prefer to cut to size and use inline responses) This is not a complete review, but some architectural thoughts and some things I've noticed. The top of the checksum_impl.h has this: * This file exists for the benefit of external programs that may wish to * check Postgres page checksums. They can #include this to get the code * referenced by storage/checksum.h. (Note: you may need to redefine * Assert() as empty to compile this successfully externally.) It's going to be a bit tricky to preserve this ability while allowing the core server and client programs to dispatch to a specialized implementation, but we should at least try. That means keeping pg_checksum_block() and pg_checksum_page() where they live now. I think a good first refactoring patch would be to move src/backend/storage/checksum.c (which your patch doesn't even touch) to src/port (and src/include/storage/checksum.h to src/include/port) and have all callers use that. With that, I imagine only that checksum.c file would include checksum_impl.h. If that poses a problem, let us know -- we may have to further juggle things. If that works without issue, we can proceed with the specialization. On that, just a few things to note here, although the next patch doesn't need to worry about any of this yet: + #if defined(__has_attribute) && __has_attribute (target) + __attribute__((target("avx2"))) + #endif + static int avx2_test(void) + { + const char buf@<:@sizeof(__m256i)@:>@; + __m256i accum =3D _mm256_loadu_si256((const __m256i *) buf); + accum =3D _mm256_add_epi32(accum, accum); + int result =3D _mm256_extract_epi32(accum, 0); + return (int) result; + }], If we're just testing if the target works, we can just use an empty function, right? +#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \ +uint32 \ +pg_checksum_block_##ISANAME(const PGChecksummablePage *page); + +#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \ +pg_attribute_target(#ISANAME) \ +uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \ I find this hard to read compared to just using the actual name. +avx2_available(void) +{ +#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__) Why guard on __x86_64__? +PG_DEFINE_CHECKSUM_ISA(default) +{ + uint32 sums[N_SUMS], result =3D 0; + uint32 i, j; [...] +#ifdef USE_AVX2_WITH_RUNTIME_CHECK +PG_DEFINE_CHECKSUM_ISA(avx2) +{ + uint32 sums[N_SUMS], result =3D 0; + uint32 i, j; [...] With the single src/port file idea above, these would just do "return pg_checksum_block()" (or pg_checksum_page, whichever makes more sense). + if (avx2_available()) + { + /* optional: patch pointer so next call goes directly */ + pg_checksum_block =3D pg_checksum_block_avx2; + return pg_checksum_block_avx2(page); + } Not sure what your referring to here by "patching" the pointer, but it sounds dangerous. Besides, the cost of indirection is basically zero for multi-kilobyte inputs, so there is not even any motivation to consider doing differently. -- John Naylor Amazon Web Services