Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uwbFz-0058b2-2T for pgsql-hackers@arkaria.postgresql.org; Thu, 11 Sep 2025 06:58:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uwbFw-006F0Y-SQ for pgsql-hackers@arkaria.postgresql.org; Thu, 11 Sep 2025 06:58:09 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uwaSE-005h7k-So for pgsql-hackers@lists.postgresql.org; Thu, 11 Sep 2025 06:06:47 +0000 Received: from mail-qt1-x836.google.com ([2607:f8b0:4864:20::836]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uwaSC-001nAA-1k for pgsql-hackers@lists.postgresql.org; Thu, 11 Sep 2025 06:06:46 +0000 Received: by mail-qt1-x836.google.com with SMTP id d75a77b69052e-4b494e774bfso1284971cf.3 for ; Wed, 10 Sep 2025 23:06:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1757570803; x=1758175603; darn=lists.postgresql.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=eF0KGMLbFDI/RHDBKpTE4pqbyzteFI25CiB9pLtsFJs=; b=AGzNyS491/7TSjL8TxkB9DTewTPGPHGAjgCFIze9o5bV7DeZTelp4fv2MKDL4r8jgI Xg8PUF42Mv5ocv2HJ+lOkoUgnrIpzpFKnz9ASFDUkLBMORaD/21OXl0MStLn/YETxItR xidJyhPf2t/U4OtC6yLq33OJmQlLyypvzS67+1CJiHtvyeADltkqJ3RlbLSzwD9WgM8h vG0eSYxIh7zCNZO1GgmDFY/jTwSTyW64NwbFk81d7JN0k7DLKnWx+btseK+jv5gGz3wI khoFw2z0M7yQB1kRy/cXMIIYJRM5+76IT9MJR22Xfojns7GWnT4xKZp5wtboeFIBmZNQ 1S9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757570803; x=1758175603; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eF0KGMLbFDI/RHDBKpTE4pqbyzteFI25CiB9pLtsFJs=; b=g0hA4CwjZibo0gEHAyHgi84SG2vPKvSagSRhOABf+UTYfmimUuwOAnUNZ3v7QL1QgJ rWqB2yeWcYKlS0NQfymz2UQg/WQBSVLSL5ksikXeBezpMnDr3GFDUEMwyO8eDAEbvQXZ pm2rvSgHc3UKRAwauXMHnNQYacex7ekmX9PwNYhRmxg/K+qK6tdA70SSsccP/Fe8NsF4 xNlQEzyYpNKdQGdsX6vhXIGeCbMNapENSMLmIhOMEhH5bawZgu+Voy7wvoYjSQ1sClTw o0loHMOgWtyQzqL34a8O0VoKQRXclLwhsIsBh8DkUPGGSMwcURW2XbzcMBJU3Y53uJMf 6DRQ== X-Gm-Message-State: AOJu0Yy/LLJ+rm7lpOGwQQ90tcJD8x3ChfYWYyITkPWwB6vNaAW9l1ty a2OEUPYNdw/barHnHRAgz8Ejj+hyQePH10JaEsvU1T9TGdORh3d6bI/8woW08oer X-Gm-Gg: ASbGnctWakUYxstEaV8GN/OjFzsJ8IrIff8yb3CBymVIHC/3Frxl4NkMgnEXhz23M0U vGPNmwiYkP9eSLz3FLMn3viRrehSVvTXH0fRle/GluxEoxlIFHgeRjCwZMn1Q2t6yMVlTA/lmRU F+soNzjU9iy5Fu1QgVxS7s1PTQzl9qn+IPLsxMO9KDA0mbrg7HJi+jDDqw8UHD5lCejDhi2OlfK a/iogGP0hEgxA49hfkX2wrBVRyrt8omzraW7tzHkt+7hf9i/i81NlsXwXGVhxITAecXA0wUb8SF mr69FVcBjCvKmrFb3PFgJg9+E0KfxGV2kFKnA+/2vJtiL4wKRM0x17NvzBcodw2c8e5Ws5p1ls6 GVF9LhgZdGaKBZwsEPWz2O1d7NQUhNQZz3fkTcACtXd8Zrzr4OyPK+fwJ/HcT/QC5Fgn85Qorvm nkYr6j3AcARg0x X-Google-Smtp-Source: AGHT+IFjGEjdks83T2Qw+LhAR/3lGeRrLoVr0QkZa38syU+MCuPUCh1Ifc4KLA4tkHIuZGvbjNentA== X-Received: by 2002:ad4:5b87:0:b0:763:83c3:5974 with SMTP id 6a1803df08f44-76383d2ab53mr8238716d6.5.1757570803114; Wed, 10 Sep 2025 23:06:43 -0700 (PDT) Received: from localhost (ec2-54-224-155-122.compute-1.amazonaws.com. [54.224.155.122]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-763bfedb916sm5402566d6.57.2025.09.10.23.06.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Sep 2025 23:06:42 -0700 (PDT) From: tenistarkim@gmail.com X-Google-Original-From: andrew.kim@intel.com To: pgsql-hackers@lists.postgresql.org Cc: alvherre@postgresql.org, andres@anarazel.de, Andrew Kim Subject: [PATCH 1/2] Enable autovectorizing pg_checksum_block Date: Thu, 11 Sep 2025 06:06:27 +0000 Message-ID: <20250911060628.3950-2-andrew.kim@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250911060628.3950-1-andrew.kim@intel.com> References: <20250911060628.3950-1-andrew.kim@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk From: Andrew Kim --- config/c-compiler.m4 | 31 +++++ configure | 52 +++++++++ configure.ac | 9 ++ meson.build | 28 +++++ src/include/pg_config.h.in | 3 + src/include/storage/checksum_impl.h | 90 +++----------- src/port/Makefile | 1 + src/port/meson.build | 1 + src/port/pg_checksum_dispatch.c | 174 ++++++++++++++++++++++++++++ 9 files changed, 318 insertions(+), 71 deletions(-) create mode 100644 src/port/pg_checksum_dispatch.c diff --git a/config/c-compiler.m4 b/config/c-compiler.m4 index da40bd6a647..5eb3218deb5 100644 --- a/config/c-compiler.m4 +++ b/config/c-compiler.m4 @@ -711,6 +711,37 @@ fi undefine([Ac_cachevar])dnl ])# PGAC_XSAVE_INTRINSICS +# PGAC_AVX2_SUPPORT +# ----------------------------- +# Check if the compiler supports AVX2 in attribute((target)) +# and using AVX2 intrinsics in those functions +# +# If the intrinsics are supported, sets pgac_avx2_support. +AC_DEFUN([PGAC_AVX2_SUPPORT], +[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl +AC_CACHE_CHECK([for AVX2 support], [Ac_cachevar], +[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include + #include + #if defined(__has_attribute) && __has_attribute (target) + __attribute__((target("avx2"))) + #endif + static int avx2_test(void) + { + const char buf@<:@sizeof(__m256i)@:>@; + __m256i accum = _mm256_loadu_si256((const __m256i *) buf); + accum = _mm256_add_epi32(accum, accum); + int result = _mm256_extract_epi32(accum, 0); + return (int) result; + }], + [return avx2_test();])], + [Ac_cachevar=yes], + [Ac_cachevar=no])]) +if test x"$Ac_cachevar" = x"yes"; then + pgac_avx2_support=yes +fi +undefine([Ac_cachevar])dnl +])# PGAC_AVX2_SUPPORT + # PGAC_AVX512_POPCNT_INTRINSICS # ----------------------------- # Check if the compiler supports the AVX-512 popcount instructions using the diff --git a/configure b/configure index 39c68161cec..54da05ac0db 100755 --- a/configure +++ b/configure @@ -17608,6 +17608,58 @@ $as_echo "#define HAVE_XSAVE_INTRINSICS 1" >>confdefs.h fi +# Check for AVX2 target and intrinsic support +# +if test x"$host_cpu" = x"x86_64"; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for AVX2 support" >&5 +$as_echo_n "checking for AVX2 support... " >&6; } +if ${pgac_cv_avx2_support+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#include + #include + #if defined(__has_attribute) && __has_attribute (target) + __attribute__((target("avx2"))) + #endif + static int avx2_test(void) + { + const char buf[sizeof(__m256i)]; + __m256i accum = _mm256_loadu_si256((const __m256i *) buf); + accum = _mm256_add_epi32(accum, accum); + int result = _mm256_extract_epi32(accum, 0); + return (int) result; + } +int +main () +{ +return avx2_test(); + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + pgac_cv_avx2_support=yes +else + pgac_cv_avx2_support=no +fi +rm -f core conftest.err conftest.$ac_objext \ + conftest$ac_exeext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx2_support" >&5 +$as_echo "$pgac_cv_avx2_support" >&6; } +if test x"$pgac_cv_avx2_support" = x"yes"; then + pgac_avx2_support=yes +fi + + if test x"$pgac_avx2_support" = x"yes"; then + +$as_echo "#define USE_AVX2_WITH_RUNTIME_CHECK 1" >>confdefs.h + + fi +fi + # Check for AVX-512 popcount intrinsics # if test x"$host_cpu" = x"x86_64"; then diff --git a/configure.ac b/configure.ac index 066e3976c0a..2c484a12671 100644 --- a/configure.ac +++ b/configure.ac @@ -2118,6 +2118,15 @@ if test x"$pgac_xsave_intrinsics" = x"yes"; then AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.]) fi +# Check for AVX2 target and intrinsic support +# +if test x"$host_cpu" = x"x86_64"; then + PGAC_AVX2_SUPPORT() + if test x"$pgac_avx2_support" = x"yes"; then + AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.]) + fi +fi + # Check for AVX-512 popcount intrinsics # if test x"$host_cpu" = x"x86_64"; then diff --git a/meson.build b/meson.build index ab8101d67b2..ff42c41ca7e 100644 --- a/meson.build +++ b/meson.build @@ -2289,6 +2289,34 @@ int main(void) endif +############################################################### +# Check for the availability of AVX2 support +############################################################### + +if host_cpu == 'x86_64' + + prog = ''' +#include +#include +#if defined(__has_attribute) && __has_attribute (target) +__attribute__((target("avx2"))) +#endif +int main(void) +{ + const char buf[sizeof(__m256i)]; + __m256i accum = _mm256_loadu_si256((const __m256i *) buf); + accum = _mm256_add_epi32(accum, accum); + int result = _mm256_extract_epi32(accum, 0); + return (int) result; +} +''' + + if cc.links(prog, name: 'AVX2 support', args: test_c_args) + cdata.set('USE_AVX2_WITH_RUNTIME_CHECK', 1) + endif + +endif + ############################################################### # Check for the availability of AVX-512 popcount intrinsics. diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index c4dc5d72bdb..987f9b5c77c 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -675,6 +675,9 @@ /* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */ #undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK +/* Define to 1 to use AVX2 instructions with a runtime check. */ +#undef USE_AVX2_WITH_RUNTIME_CHECK + /* Define to 1 to use AVX-512 popcount instructions with a runtime check. */ #undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h index da87d61ba52..82e525529f4 100644 --- a/src/include/storage/checksum_impl.h +++ b/src/include/storage/checksum_impl.h @@ -101,12 +101,14 @@ */ #include "storage/bufpage.h" +#include "pg_config.h" /* number of checksums to calculate in parallel */ #define N_SUMS 32 /* prime multiplier of FNV-1a hash */ #define FNV_PRIME 16777619 + /* Use a union so that this code is valid under strict aliasing */ typedef union { @@ -142,74 +144,20 @@ do { \ * Block checksum algorithm. The page must be adequately aligned * (at least on 4-byte boundary). */ -static uint32 -pg_checksum_block(const PGChecksummablePage *page) -{ - uint32 sums[N_SUMS]; - uint32 result = 0; - uint32 i, - j; - - /* ensure that the size is compatible with the algorithm */ - Assert(sizeof(PGChecksummablePage) == BLCKSZ); - - /* initialize partial checksums to their corresponding offsets */ - memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets)); - - /* main checksum calculation */ - for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++) - for (j = 0; j < N_SUMS; j++) - CHECKSUM_COMP(sums[j], page->data[i][j]); - - /* finally add in two rounds of zeroes for additional mixing */ - for (i = 0; i < 2; i++) - for (j = 0; j < N_SUMS; j++) - CHECKSUM_COMP(sums[j], 0); - - /* xor fold partial checksums together */ - for (i = 0; i < N_SUMS; i++) - result ^= sums[i]; - - return result; -} - -/* - * Compute the checksum for a Postgres page. - * - * The page must be adequately aligned (at least on a 4-byte boundary). - * Beware also that the checksum field of the page is transiently zeroed. - * - * The checksum includes the block number (to detect the case where a page is - * somehow moved to a different location), the page header (excluding the - * checksum itself), and the page data. - */ -uint16 -pg_checksum_page(char *page, BlockNumber blkno) -{ - PGChecksummablePage *cpage = (PGChecksummablePage *) page; - uint16 save_checksum; - uint32 checksum; - - /* We only calculate the checksum for properly-initialized pages */ - Assert(!PageIsNew((Page) page)); - - /* - * Save pd_checksum and temporarily set it to zero, so that the checksum - * calculation isn't affected by the old checksum stored on the page. - * Restore it after, because actually updating the checksum is NOT part of - * the API of this function. - */ - save_checksum = cpage->phdr.pd_checksum; - cpage->phdr.pd_checksum = 0; - checksum = pg_checksum_block(cpage); - cpage->phdr.pd_checksum = save_checksum; - - /* Mix in the block number to detect transposed pages */ - checksum ^= blkno; - - /* - * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of - * one. That avoids checksums of zero, which seems like a good idea. - */ - return (uint16) ((checksum % 65535) + 1); -} +#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \ +uint32 \ +pg_checksum_block_##ISANAME(const PGChecksummablePage *page); + +#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \ +pg_attribute_target(#ISANAME) \ +uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \ + +/* Declare ISA implementations (declarations only in header) */ +#ifdef USE_AVX2_WITH_RUNTIME_CHECK +PG_DECLARE_CHECKSUM_ISA(avx2); +#endif +PG_DECLARE_CHECKSUM_ISA(default); + +uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page); +extern uint32 (*pg_checksum_block)(const PGChecksummablePage *page); +extern uint16 pg_checksum_page(char *page, BlockNumber blkno); diff --git a/src/port/Makefile b/src/port/Makefile index 4274949dfa4..27423f1058b 100644 --- a/src/port/Makefile +++ b/src/port/Makefile @@ -48,6 +48,7 @@ OBJS = \ pg_numa.o \ pg_popcount_aarch64.o \ pg_popcount_avx512.o \ + pg_checksum_dispatch.o \ pg_strong_random.o \ pgcheckdir.o \ pgmkdirp.o \ diff --git a/src/port/meson.build b/src/port/meson.build index fc7b059fee5..c4bbe9f2ece 100644 --- a/src/port/meson.build +++ b/src/port/meson.build @@ -11,6 +11,7 @@ pgport_sources = [ 'pg_numa.c', 'pg_popcount_aarch64.c', 'pg_popcount_avx512.c', + 'pg_checksum_dispatch.c', 'pg_strong_random.c', 'pgcheckdir.c', 'pgmkdirp.c', diff --git a/src/port/pg_checksum_dispatch.c b/src/port/pg_checksum_dispatch.c new file mode 100644 index 00000000000..15f7b8af34f --- /dev/null +++ b/src/port/pg_checksum_dispatch.c @@ -0,0 +1,174 @@ +/*------------------------------------------------------------------------- + * + * pg_checksum_dispatch.c + * Holds the AVX2 pg_popcount() implementation. + * + * Copyright (c) 2024-2025, PostgreSQL Global Development Group + * + * IDENTIFICATION + * src/port/pg_checksum_dispatch.c + * + *------------------------------------------------------------------------- + */ +#include "c.h" +#include "storage/checksum_impl.h" + +#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT) +#include +#endif + +#ifdef HAVE_XSAVE_INTRINSICS +#include +#endif + +#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX) +#include +#endif + +#include "port/pg_bitutils.h" + +/* + * Does CPUID say there's support for XSAVE instructions? + */ +static inline bool +xsave_available(void) +{ + unsigned int exx[4] = {0, 0, 0, 0}; + +#if defined(HAVE__GET_CPUID) + __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]); +#elif defined(HAVE__CPUID) + __cpuid(exx, 1); +#elif defined(__x86_64__) +#error cpuid instruction not available +#endif + return (exx[2] & (1 << 27)) != 0; /* osxsave */ +} + +/* + * Does XGETBV say the YMM registers are enabled? + * + * NB: Caller is responsible for verifying that xsave_available() returns true + * before calling this. + */ +#ifdef HAVE_XSAVE_INTRINSICS +pg_attribute_target("xsave") +#endif +static inline bool +ymm_regs_available(void) +{ +#ifdef HAVE_XSAVE_INTRINSICS + return (_xgetbv(0) & 0x06) == 0x06; +#else + return false; +#endif +} + +/* + * Does CPUID say there's support for AVX-2 + */ +static inline bool +avx2_available(void) +{ +#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__) + unsigned int exx[4] = {0, 0, 0, 0}; + if (!xsave_available() || !ymm_regs_available()) return false; + +#if defined(HAVE__GET_CPUID_COUNT) + __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]); +#elif defined(HAVE__CPUIDEX) + __cpuidex(exx, 7, 0); +#else +#error cpuid instruction not available +#endif + return (exx[1] & (1 << 5)) != 0; /* avx2 */ +#else + return false; +#endif +} + + +/* default checksum implementation */ +PG_DEFINE_CHECKSUM_ISA(default) +{ + uint32 sums[N_SUMS], result = 0; + uint32 i, j; + + Assert(sizeof(PGChecksummablePage) == BLCKSZ); + memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets)); + + for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++) + for (j = 0; j < N_SUMS; j++) + CHECKSUM_COMP(sums[j], page->data[i][j]); + + for (i = 0; i < 2; i++) + for (j = 0; j < N_SUMS; j++) + CHECKSUM_COMP(sums[j], 0); + + for (i = 0; i < N_SUMS; i++) + result ^= sums[i]; + + return result; +} + +#ifdef USE_AVX2_WITH_RUNTIME_CHECK +PG_DEFINE_CHECKSUM_ISA(avx2) +{ + uint32 sums[N_SUMS], result = 0; + uint32 i, j; + + Assert(sizeof(PGChecksummablePage) == BLCKSZ); + memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets)); + + for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++) + for (j = 0; j < N_SUMS; j++) + CHECKSUM_COMP(sums[j], page->data[i][j]); + + for (i = 0; i < 2; i++) + for (j = 0; j < N_SUMS; j++) + CHECKSUM_COMP(sums[j], 0); + + for (i = 0; i < N_SUMS; i++) + result ^= sums[i]; + + return result; +} +#endif + +/* Function pointer - external linkage (declared extern in header) */ +uint32 (*pg_checksum_block)(const PGChecksummablePage *page) = pg_checksum_block_dispatch; + +/* Dispatch function: simple, safe */ +uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page) +{ +#ifdef USE_AVX2_WITH_RUNTIME_CHECK + if (avx2_available()) + { + /* optional: patch pointer so next call goes directly */ + pg_checksum_block = pg_checksum_block_avx2; + return pg_checksum_block_avx2(page); + } +#endif + /* fallback */ + pg_checksum_block = pg_checksum_block_default; + return pg_checksum_block_default(page); +} + + +/* Compute checksum for a Postgres page */ +uint16 pg_checksum_page(char *page, BlockNumber blkno) +{ + PGChecksummablePage *cpage = (PGChecksummablePage *) page; + uint16 save_checksum; + uint32 checksum; + + Assert(!PageIsNew((Page) page)); + + save_checksum = cpage->phdr.pd_checksum; + cpage->phdr.pd_checksum = 0; + checksum = pg_checksum_block(cpage); + cpage->phdr.pd_checksum = save_checksum; + + checksum ^= blkno; + return (uint16)((checksum % 65535) + 1); +} -- 2.43.0