Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vB35E-009eWQ-62 for pgsql-hackers@arkaria.postgresql.org; Tue, 21 Oct 2025 03:30:47 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vB35C-0065iP-4i for pgsql-hackers@arkaria.postgresql.org; Tue, 21 Oct 2025 03:30:45 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vB35B-0065i0-LO for pgsql-hackers@lists.postgresql.org; Tue, 21 Oct 2025 03:30:44 +0000 Received: from mail-qk1-x744.google.com ([2607:f8b0:4864:20::744]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vB358-002uG6-2X for pgsql-hackers@lists.postgresql.org; Tue, 21 Oct 2025 03:30:43 +0000 Received: by mail-qk1-x744.google.com with SMTP id af79cd13be357-8907af237ceso602984385a.3 for ; Mon, 20 Oct 2025 20:30:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761017442; x=1761622242; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=uLKDEdXk9+KmTr7a8ZyDFnUUglrrQA9CR3sJP/nK84k=; b=CNPRFw98PafPP+4pIe2dST574BhWKtpsGyKVCsBENscwaAR18NI94HIu92zBF3BS4D oY2TlN9fJDmQp4A8iKsPICrDYhsx9V+a1iWY0c9Bnrp6enTAKHO5UBYVGUxl4HIfMkEK wl3fCGPOLb2hISUh+d1PPC+z5cuz+vENp+zVMZYUKv9ijEkwsSiwFvybLxUa0Yy/adk+ z7SxF92/uaoNqy0D4/TWNxBnvMim+XbCL9KRIeNDhT8STb1S235JFaPqxFx3vupV3lWJ VODOdCRipGb+UWzp8ZfvKYsVurvoevtfOOMktCo270d/EThUY7UZKVXpyn//4UbPkySg L2Uw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761017442; x=1761622242; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uLKDEdXk9+KmTr7a8ZyDFnUUglrrQA9CR3sJP/nK84k=; b=MEbVPwFvk5PnYBF7RQBH/H1YIqLn0qpC7pZgC9j41xXyeamssgJOnfxBIMy7tLpKND HK37hwX5jgy/s0JddSoaGnI6lMmrGWr8rJS3gWjQk8Ul6apRxnop91G6aoEnu37bKEVM KKCh2W0UiLJrZB5Zcna34djRGqVerHwhQEK7S8GuNmEUz5+XKzyGHMQT2Tz8WYhgs2Go TU6bj99BqmWi2IFZNVcqdZvYfRtKYInPNtETTkP6Nd5pD5X8CKLCg+pyjC/Bcz/4mQUC +Dn9j0O/xgZWsEAmFgaPe/fbBE66DUXNeKh9ctb+JJnUpsBKA0d2i97Ia7VpKqbtFsDg GHrA== X-Gm-Message-State: AOJu0YxAEZOY56ULt6p+9gWsgrXBltoXQKCG0nJ+O/o94YL7tNsNtKwm evzZw9Fg7cFJpSziCqbqwY0g5jtEflmmgU6WmnANzalfJKTp/dOwFsx4b5Nfw3gPAMbLpqvrnUY JoJQkc8ltkMlgM23ynmjODYvnGyJ0Vik= X-Gm-Gg: ASbGncuRmFxJAXjZFGk9n4+8bcDQGLpoO+TTw4DeHMq+72GT4BM0Hq4ZP/jh3yiGw7X 7WfkJki5E6evX93HgqlGm92XYuJFOvQin9YVZtPL7Jqy30ESFjfoA3fAMzgrIuTDqjLIMKUfvcR Ej+kain+hRbL8v6Pa3wqFFrLJd/k7WsLNB8ZhBnsEXnWxcQqo0ZahRLuk7khduv4agH3cfD++ba AAYh3FO9Z2PybqszezWV8RIYoB4p0kyV4b108ZFsCwWuJ38qU3X8S5bAnIoHOtbtHm8laWww0J8 l/TRPx1Ws7IoBBCOBHzZ5k3TMIfHQdVDgMuAfhdQajgwQz3cKkBUvWT8tEBiNOJfVA9Stvmrruk Nzyz8 X-Google-Smtp-Source: AGHT+IEpRjxGX/fa34oq3ESViyPvo3hq+K9sHojJ6Eg1lU+fD+aWjDUhoi4ytxrMSC9oAoEATBX+wmG+vM1rBBqUpTI= X-Received: by 2002:a05:620a:c4c:b0:838:d547:a88e with SMTP id af79cd13be357-8906e7b8cddmr1836277085a.13.1761017441779; Mon, 20 Oct 2025 20:30:41 -0700 (PDT) MIME-Version: 1.0 References: <20250911054220.3784-1-root@ip-172-31-36-228.ec2.internal> In-Reply-To: From: John Naylor Date: Tue, 21 Oct 2025 10:30:30 +0700 X-Gm-Features: AS18NWDB1JFb2G0Ara_5bG35y7Qxk9F6bgjDC0eZxC01ag-UKkZMuXN_Fi4GB30 Message-ID: Subject: Re: Proposal for enabling auto-vectorization for checksum calculations To: Andrew Kim Cc: pgsql-hackers@lists.postgresql.org, Oleg Tselebrovskiy Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Oct 17, 2025 at 2:15=E2=80=AFPM Andrew Kim = wrote: > > Hi John, > > Thank you for your detailed and constructive feedback on the checksum > AVX2 optimization patch. > I've carefully addressed all of your concerns and am pleased to share > the updated V6 implementation. Great! I know we're on v7 now, but I'm going to make a request for next time you respond to a review: Respond in-line to each point. As I mentioned before, > On Wed, Oct 1, 2025 at 10:26=E2=80=AFPM John Naylor wrote: > > (BTW, we discourage top-posting and prefer to cut to size and > > use inline responses) Please don't top-post again, as it clutters our archives in addition to making it easy to forget things. I'm now going to copy the things that were either not addressed or misunderstood: > > I think a good first refactoring patch would be to move > > src/backend/storage/checksum.c (which your patch doesn't even touch) > > to src/port (and src/include/storage/checksum.h to src/include/port) > > and have all callers use that. With that, I imagine only that > > checksum.c file would include checksum_impl.h. > > > > If that poses a problem, let us know -- we may have to further juggle > > things. If that works without issue, we can proceed with the > > specialization. That means the first patch moves things around without adding any platform-specific code, and the next patch adds the specialization. I think that would be a lot easier to review and test, especially to avoid breaking external programs (see below for more on this). A committer can always squash things together if it make sense to do so. > > + #if defined(__has_attribute) && __has_attribute (target) > > + __attribute__((target("avx2"))) > > + #endif > > + static int avx2_test(void) > > + { > > + const char buf@<:@sizeof(__m256i)@:>@; > > + __m256i accum =3D _mm256_loadu_si256((const __m256i *) buf); > > + accum =3D _mm256_add_epi32(accum, accum); > > + int result =3D _mm256_extract_epi32(accum, 0); > > + return (int) result; > > + }], > > > > If we're just testing if the target works, we can just use an empty > > function, right? Oleg mentioned the same thing later. It's a waste of time for us to repeat ourselves. I said you didn't have to worry about it yet, because I was hoping to see the refactoring first. Now, aside from that I looked further into this: > > The top of the checksum_impl.h has this: > > > > * This file exists for the benefit of external programs that may wish = to > > * check Postgres page checksums. They can #include this to get the co= de > > * referenced by storage/checksum.h. (Note: you may need to redefine > > * Assert() as empty to compile this successfully externally.) > > > > It's going to be a bit tricky to preserve this ability while allowing > > the core server and client programs to dispatch to a specialized > > implementation, but we should at least try. That means keeping > > pg_checksum_block() and pg_checksum_page() where they live now. Looking at commit f04216341dd1, we have at least one example of an external program, pg_filedump. If we can keep this working with minimal fuss, it should be fine everywhere. https://github.com/df7cb/pg_filedump/blob/master/pg_filedump.c#L29 ``` /* checksum_impl.h uses Assert, which doesn't work outside the server */ #undef Assert #define Assert(X) #include "storage/checksum.h" #include "storage/checksum_impl.h" ``` Elsewhere they already have to do things like ``` #if PG_VERSION_NUM < 110000 " Previous Checkpoint Record: Log File (%u) Offset (0x%08x)\n" #endif ``` ...so it's probably okay if they have to adjust for a new #include path, but I want to verify that actually works, and I don't want to make it any more invasive than that. As we proceed, I can volunteer to do the work to test that pg_filedump still builds fine with small changes. Feel free to try building it yourself, but I'm happy to do it. Oleg posted another review recently, so I won't complicate things further, but from a brief glance I will suggest for next time not to change any comments that haven't been invalidated by the patch. -- John Naylor Amazon Web Services