Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCCsK-00AV0r-FL for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 08:10:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vCCsI-00EqIM-Uy for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 08:10:13 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCCsI-00EqID-Ft for pgsql-hackers@lists.postgresql.org; Fri, 24 Oct 2025 08:10:13 +0000 Received: from mail-ed1-x52c.google.com ([2a00:1450:4864:20::52c]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vCCsF-003VBq-1B for pgsql-hackers@lists.postgresql.org; Fri, 24 Oct 2025 08:10:12 +0000 Received: by mail-ed1-x52c.google.com with SMTP id 4fb4d7f45d1cf-63becaea1cdso294322a12.2 for ; Fri, 24 Oct 2025 01:10:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761293410; x=1761898210; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=rQBkDTGxlm2UT+vj+g0b96NklrD6h6O4iTLLK96Qs8Y=; b=hrhlkh74ACLyNY+Tafw9NGjSS35RGcTwLroUllA20tEaCfhOzJWWoacvsX7AyiQMsN W0Rbrv3defG8kvajuF7IRNN6QZcHzqKnlL93Vvfag19QToxIqyyWw2K5bSIHOncrGy89 R/5FHM/bIjTJ2yA/PQmkqOZ1iVjW6DQHLMQ6mG84aKLjgG26gFF7pFU+Ewo7Vo5PKbcj tJ545KvgL1idWkH253aPx/L6jGGwuen6LaLr630aMZJTk//MtWcPBDY95ZJRyg9Y1jmA dFyAyLs/LNCmRJjIe20caVstrS4tTEQY2RlVWzKwgn7S4vUzcfyIBTBPDq3x8Z6F885J 94KA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761293410; x=1761898210; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rQBkDTGxlm2UT+vj+g0b96NklrD6h6O4iTLLK96Qs8Y=; b=rxi4QyaalWuO5pRUaTxI5bO9pH6xJtnEjfc8s5Kh0XlUsnWcGSt72NSSqUkXx89Vty YnjaKDdpDN1CPi4F/T3Y5dEk5gY0ds6UJFA8beC3Juaps0VWRp/9S9+/ppaNeIxxJaL9 UO1iskr92DqSVW6LTtfi3IgrucrlSHrbEHYrhybmSiiOfxIaDL0YfYPmU4kf9YTVPmMx zsVt7H+MPcS+1QfkA8ljv5blHgXgxa6nyfOm0QpKry4PewjxPZGEip/5D8DYSu0Lr45n pdnH6KspHCQsiLMhrWkwj8sKPgsCFtKxrdf40U02wU7Tceq9cgO3cLmMNvEG/Qp9jwap lLQw== X-Forwarded-Encrypted: i=1; AJvYcCXdWX06mc4wzMPmHfnRqsti4+KahbydxnopOM7xqdz0+b8dBAk/unL6A5Ug9iMS+niDE3zT7Vfe+Fun3pt1@lists.postgresql.org X-Gm-Message-State: AOJu0YwS8hifzfRGHemJg0n4AXgTdPxefpIYcowIybiD60bvboSbhi9v Ce3VYl1qHU5ucSmgpUGRb4swNJ1xmTh+tMU7k/9OmQokznC8gvbuWC2Pesznh3LU9vK3Bz0LOg4 rUavX2M9tuWoStxii9zznTbTMuw1Vwpk= X-Gm-Gg: ASbGncvSr/N5QLFuuC0EPCdx4SGkweYaP+BKj8VpJdzr7m2NIm6fVYKllhj0VrfYoD8 YZHobEgi6QTJqIYjtV8J31wTmOv83VFTkrBT2Te3SsS7Cu68GzMspF8RBgi1kkTY+GrBtjUg8m1 qh3JsFXAt1XbqjLijFHXhnwGR/laMg9HOSitql7mBHYQfe7peK38w5SY9T7vJt4Wdmfyian0hUM ABXXK0AfizyeR6GPK9dAZjUdvLq4E145DZy6E6Yl20mf6zCzYEwdFiFBDhswm31gE3+/A== X-Google-Smtp-Source: AGHT+IHW9T+Zxqnxq+R0muHTqkHPIqHQ83HdfQJ2v0ooMPN2g2HXkj4dCXiTPyAHUPWrewtpG3ELzH2XX8m9U9VMkCk= X-Received: by 2002:a05:6402:4307:b0:63c:1d4a:afcb with SMTP id 4fb4d7f45d1cf-63d09d3bd06mr7094719a12.0.1761293409299; Fri, 24 Oct 2025 01:10:09 -0700 (PDT) MIME-Version: 1.0 References: <20250911054220.3784-1-root@ip-172-31-36-228.ec2.internal> <44d2472a978912ca352eb839a24b2176@postgrespro.ru> In-Reply-To: <44d2472a978912ca352eb839a24b2176@postgrespro.ru> From: Andrew Kim Date: Fri, 24 Oct 2025 01:09:56 -0700 X-Gm-Features: AWmQ_bn1073e_9eWa2X8rPTd-gwpavztmMmWxFgobrKT0eNo4gyBm1KTPDTt6e0 Message-ID: Subject: Re: Proposal for enabling auto-vectorization for checksum calculations To: Oleg Tselebrovskiy Cc: John Naylor , pgsql-hackers@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Oleg, Thank you for the detailed review on v7 patch. On Mon, Oct 20, 2025 at 8:05=E2=80=AFAM Oleg Tselebrovskiy wrote: > > Thanks for the new patch version! > > Another round of review: > > 1) I think that changes to contrib/pageinspect/rawpage.c should be in > the main patch, not in the benchmark patch. Also, without those chages > the main patch can't compile using make world-bin > This is already correctly handled in v8. The contrib/pageinspect/rawpage.c change is in the main patch (v8-0001), not in the benchmark patch. The include statement was updated from #include "storage/checksum.h" to #include "port/checksum.h" in the refactoring patch, which is the correct placement. > 2) I still don't get why you check for working intrinsics in configure, > config/c-compiler.m4 and meson.build, if your patch later uses them. > I've gotten correct assembly code with this avx2_test function: > #include > #if defined(__has_attribute) && __has_attribute (target) > __attribute__((target("avx2"))) > static int avx2_test(void) > { > return 0; > } > #endif > Please, check if this works for you and consider using something similar > I agree. In v8, I've simplified the configure tests significantly. The config/c-compiler.m4 now uses exactly the pattern you suggested: > 3) __builtin_cpu_supports doesn't work on Windows at all. We still have > to use approach with __get_cpuid > Completely fixed in v8. I've removed all usage of __builtin_cpu_supports and implemented proper cross-platform CPU detection using __get_cpuid (Linux/GCC) and __cpuid (Windows/MSVC) with proper preprocessor guards > 4) Looks like you can safely remove "port/checksum_impl.h" from > src/bin/pg_checksums/pg_checksums.c. It probably links with libpgport > and/or libpgcommon, so it gets pg_checksum_page from there. Same with > src/bin/pg_upgrade/file.c. Maybe those includes are "for clarity" and > you don't need to remove them, but pg_checksums and pg_upgrade seem to > work without them > In v8-0001, both files now only include "port/checksum.h". The direct inclusion of checksum_impl.h has been removed: src/bin/pg_checksums/pg_checksums.c: Only includes "port/checksum.h" src/bin/pg_upgrade/file.c: Only includes "port/checksum.h" > 5) You don't need #include /* for memcpy */ in > checksum_impl.h. At the very least, memcpy was used before your patch > without string.h > Confirmed there's no explicit #include in the v8 checksum_impl.h. The memcpy usage relies on the standard PostgreSQL includes. > 6) Why did you remove Assert(sizeof(PGChecksummablePage) =3D=3D BLCKSZ)? = Is > it always false? > I didn't remove it - it's still present in both implementations in v8. In both pg_checksum_block_default and pg_checksum_block_avx2, you can see: /* ensure that the size is compatible with the algorithm */ Assert(sizeof(PGChecksummablePage) =3D=3D BLCKSZ); > 7) Is reformatted variable declaration in pg_checksum_block_default_impl > really needed? Is there a good reason for it? Or is it auto-formatting > programm output? > Sorry, that's my mistake, In v8, I've kept the variable declarations consistent with PostgreSQL style without unnecessary reformatting. The declarations in both functions follow the same pattern as the original code. > 8) Your patch removes one whitespace in this line - for (i =3D 0; i < > (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++) > If you wish to fix formatting like that - please, do it in a separate > patch. If this was done automatically by some formatting tool - please, > revert this change > In v8, I've preserved the original formatting. The for loop maintains the original spacing: for (i =3D 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++) with proper space after (uint32). > 9) Unneeded empty string > #define FNV_PRIME 16777619 > > + > /* Use a union so that this code is valid under strict aliasing */ > typedef union > Fixed, removed unnecessary blank lines in v8. > 10) You need one line with just /* at the beginning of the comment, look > at other multiline comments in this file > + /* For now, AVX2 implementation is identical to default > + * The compiler will auto-vectorize this with proper flags > + * Future versions could use explicit AVX2 intrinsics here > */ > Fixed, it's started style with the opening /* on its own line: /* * AVX2-optimized block checksum algorithm. * Same algorithm as default, but compiled with AVX2 target for auto-vectorization. */ > 11) Function pg_checksum_block_simple isn't used at all. > There's no pg_checksum_block_simple function in v8. The implementation only has the necessary functions: pg_checksum_block_default, pg_checksum_block_avx2, and pg_checksum_block_choose. > 12) Why do you need those? > +#ifndef PG_CHECKSUM_EXTERNAL_INTERFACE > +#define PG_CHECKSUM_EXTERNAL_INTERFACE > These macros are not present in v8. The implementation is cleaner without unnecessary preprocessor guards. > 13) Object files are added according to alphabetical order, not logical > order (src/port/Makefile) > pg_popcount_aarch64.o \ > pg_popcount_avx512.o \ > + checksum.o \ > pg_strong_random.o \ > pgcheckdir.o \ > In v8, checksum.o is correctly placed in alphabetical order in the OBJS list in src/port/Makefile: OBJS =3D \ $(LIBOBJS) \ $(PG_CRC32C_OBJS) \ bsearch_arg.o \ checksum.o \ chklocale.o \ > 14) I still think that src/port/checksum.c needs to just include > src/include/port/checksum_impl.h and have no other logic to keep > checksum_impl.h's role as "header with full implementation" > Now checksum_impl.h doesn't have any mention of pg_checksum_page > The current v8 approach has checksum.c simply include checksum_impl.h, which maintains the "header with full implementation" pattern you prefer. However, the function pointer mechanism and runtime detection logic is in checksum_impl.h, which means pg_checksum_page (the external interface) is also defined there. This keeps the external interface clean while maintaining the implementation details in the header. > 15) Assembly for pg_checksum_block_choose now has full code of > pg_checksum_block_default. This is probably a result of using inline > functions > Don't know if this is bad, but it is at least strange > I think that is expected behavior with the function pointer approach. The compiler inlines the first call, but subsequent calls use the cached function pointer, which is the standard PostgreSQL pattern for runtime CPU feature detection (see CRC32C implementation). > Also, some CFBot checks have failed. Two of them with this error/warning > checksum.c:88:1: error: no previous prototype for =E2=80=98pg_checksum= _page=E2=80=99 > [-Werror=3Dmissing-prototypes] > 88 | pg_checksum_page(char *page, BlockNumber blkno) > | ^~~~~~~~~~~~~~~~ > Please, address those > In v8, pg_checksum_page is declared in src/include/port/checksum.h, which is included by checksum.c. This should resolve the missing prototype error. > Oleg Tselebrovskiy, PostgresPro