Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wCZQ9-0027TP-14 for pgsql-hackers@arkaria.postgresql.org; Tue, 14 Apr 2026 08:46:57 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wCZQ7-00AncL-0s for pgsql-hackers@arkaria.postgresql.org; Tue, 14 Apr 2026 08:46:56 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wCZQ6-00AncA-33 for pgsql-hackers@lists.postgresql.org; Tue, 14 Apr 2026 08:46:55 +0000 Received: from mail-pl1-x62a.google.com ([2607:f8b0:4864:20::62a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wCZQ5-00000000zcT-2r6x for pgsql-hackers@lists.postgresql.org; Tue, 14 Apr 2026 08:46:55 +0000 Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-2b4583f0a1aso13447145ad.3 for ; Tue, 14 Apr 2026 01:46:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1776156412; x=1776761212; darn=lists.postgresql.org; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:from:to:cc:subject:date:message-id:reply-to; bh=xzs/uZIArvl3mzor2zeuktXOSo4N7268Ofsk8tD5D/k=; b=QZtBinEQKoEoDCpyWwo2MM/J+8o/hm+MIBJF4CnJAIN7Ii02OQarCupmf0rySgNkEB YPjfWdG878Apblu1v6DBlNhcTHVEl5ekbziDSJyaO4gX9Hgpmwocw0t3KdTyqLIm6m1O fm7HIAwEoru1DqrEfz5J4wP0rOuxnxFLO0pr3Gam9ZJwW2lax0IqTQjIH5u9Vje9GkG1 1AA8UpQvIPqHZqr6OvQWnLfY+aAPWspMw1uPrABgOo0/+hR+F8PFr67yRh+VRRv6paLL Dd0AGHmKXeh+5gRIJGVHB/4SjLyZuyaFwz2NiXIHmWnGY12uUWWnhoFnkjTMO218brQ7 OEUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776156412; x=1776761212; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:x-gm-gg:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=xzs/uZIArvl3mzor2zeuktXOSo4N7268Ofsk8tD5D/k=; b=VOurrzPbeuKaMQObtP+VZN9CgKLp3jy+GNplLL+oSuwgYxS27DqjYJP8Ur3vESJVpd Lc4WHjbFmmEVT8rQOCP6oPHQsp2j7mWMiTf8KT09cvorey+MgbdvI2CpClsudY+Vsuwj 8pL/Y9/tSeB1uq7LM4BVwrb30XSEmUbBzJvhgdn0KPm/f2EjJ+E8o7cOqAAe5jnTgqOD 8wt8V+6/eT/f/PQ6vbeWZ4Rg6Sx7j5YJGq09+b8hrAWPjA2rDyxBhhpfOsuBaYnVAVZI KjOOWVh/H34sh/tMlc2B/vP2EwbHHQmHuiRIjCwtldkKMKqAKSL4cYZzXOLDF0jui7Rp GQow== X-Gm-Message-State: AOJu0YwKy2jOVv1XEvrLcFxvKq+qQlgCjVN8U6e/ZPJTnrPpcgT4KyKB W8/RAB5micwgVPF6Y7Xb6TPrK+LYiBMN5KcJV8jMWfjpT+DLN6MDGRzvSsxscQaWFQc= X-Gm-Gg: AeBDievIklPDHajAJPjra26w+9m/8qmzbuzyqS98LSb7YeqYWJhDgHXC8q4AmRUliXh oc/mOy86R9xjGU1BC/IGz4bCiORWxVLdPeV7LXe17c+x7AClccfQ/XnPYMMVriK7j+1fqcLERxW kBePwwnbU4pSfRITuepo+qnweLxQ/tGnLrtWWJXLr8x4AlPspGuD8dYHkYdPBTlZGroLfWQtl+D cJ82SD0I1BU4I1JF9pdtGb+hjC1GphwEi7yyOGyZxHieoTzD7Wy9kD24P3NmeNYBFjv8jqw8ViG cubtvT7TMO9071FL44tT/n08ErYqb1QckwAu4iLImvMZTIplr/sjSHn4X3aUPztW5ScsmYePIn/ GXemSdLhtegmRTooLGcUOa0et5OIM4ZKCINkfGr4OZdnG5yzj/WchbbTUSPhEeKQb56Z+cnjGHz kzokGTT/L8nWWBMtU8Tzk3AYIYKZ9uV/o= X-Received: by 2002:a17:903:1103:b0:2b0:5a4c:726a with SMTP id d9443c01a7336-2b2d5aa7fa2mr168355705ad.43.1776156411560; Tue, 14 Apr 2026 01:46:51 -0700 (PDT) Received: from smtpclient.apple ([45.32.121.103]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b2d4f25d32sm144384405ad.57.2026.04.14.01.46.49 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 14 Apr 2026 01:46:50 -0700 (PDT) From: Chao Li Message-Id: <278B9FE3-F349-4494-99C5-483105C1C999@gmail.com> Content-Type: multipart/mixed; boundary="Apple-Mail=_E88B1AFD-49AE-4390-BE40-3D28F579A457" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3864.400.21\)) Subject: Re: Add bms_offset_members() function for bitshifting Bitmapsets Date: Tue, 14 Apr 2026 16:46:11 +0800 In-Reply-To: Cc: PostgreSQL Developers To: David Rowley References: X-Mailer: Apple Mail (2.3864.400.21) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --Apple-Mail=_E88B1AFD-49AE-4390-BE40-3D28F579A457 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii > On Apr 13, 2026, at 12:35, David Rowley wrote: >=20 > (v20 material) >=20 > While working on some new code which required offsetting the members > of a Bitmapset, I decided to go and write a function to do this rather > than copy the various other places where we manually construct a new > set with a bms_next_member() -> bms_add_member() loop. The new use > case I have is from pulling varattnos from a scan's targetlist, which > there could be several hundred Vars in. I considered this might be > noticeably expensive. >=20 > The current manual way we have of doing this is a bit laborious since > we're only doing 1 member per bms_next_member() loop, and also, if the > set has multiple words, we may end up doing several repallocs to > expand the set, perhaps as little as 1 word at a time. That's not to > mention the rather repetitive code that we have to do this in various > places that might be nice to consolidate. >=20 > I've attached a patch which adds bms_offset_members(), which does > bitshifting to move the members up or down by the given offset. While > working on this I made a few choices which might be worth a revisit: >=20 > 1) The function modifies the given set in-place rather than making a = new set. > 2) The function will ERROR if any member would go above INT_MAX. These > would be inaccessible, and that seems weird/wrong. > 3) When offsetting by a negative value, members that would go below > zero fall out of the set silently. >=20 > For #1; my original use-case that made me write this didn't need a > copy, so I wrote the function to modify the set in-place. After > hunting down and replacing the relevant existing bms_next_member() > loops with the new function, I noticed all these seem to need a copy. > Currently, I have coded the patch to do > bms_offset_members(bms_copy(set), ...), but that's a little > inefficient as it *could* result in a palloc for the copy then a > repalloc in the offset. If bms_offset_members() just created a new > set, then it could palloc() a set to the exact required size. The > counterargument to that is that I don't really want to copy the set > for my intended use case. I thought about two versions, but thought > that might be overkill. There could be a boolean parameter to define > that, but we don't do that anywhere else in bitmapset.c, or we could > avoid a copy-paste of the code with an always-inlined helper function. > I couldn't decide, so left it as-is. >=20 > For #2, I could have equally made these fall off the top of the set, > but I thought we might want to know about it in the unlikely event > that this ever happens. >=20 > #3 We commonly want to get rid of system columns from a > pull_varattnos() set which is offset by > FirstLowInvalidHeapAttributeNumber, so those disappearing silently is > what most use cases seem to want. I expect that's not for revisiting, > but I listed this one anyway as it flies in the face of #2. >=20 > Patch attached. Comments welcome. >=20 > David > I basically agree with design choices 1/2/3. And the implementation of = v1 overall looks good to me. The only issue I found is this overflow check: ``` + /* Handle a positive offset (bitshift left) */ + if (offset > 0) + { + /* + * An unlikely scenario, but check we're not going to = create a member + * greater than PG_INT32_MAX. + */ + if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + = high_bit + offset_bits > PG_INT32_MAX) + elog(ERROR, "bitmapset overflow"); ``` This overflow check seems wrong. Because when high_bit + offset_bits > = BITS_PER_BITMAPWORD, new_nwords has been increased by 1, so there = high_bit + offset_bits are double counted. To verify that, I added a new test: ``` -- 2147483583 is PG_INT32_MAX - 64, so offsetting by 1 should succeed, -- but offsetting it by 65 should fail with overflow error SELECT test_random_offset_operations_check_highest(2147483583, 1) AS = result; SELECT test_random_offset_operations_check_highest(2147483583, 65) AS = result; ``` With v1, test_random_offset_operations_check_highest(2147483583, 1) = reports an overflow error, but it should not. Please see the attached diff for the test I added. In that diff, I also = included a fix, and with that fix, the tests pass. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ --Apple-Mail=_E88B1AFD-49AE-4390-BE40-3D28F579A457 Content-Disposition: attachment; filename=nocfbot_chao_test.diff Content-Type: application/octet-stream; x-unix-mode=0644; name="nocfbot_chao_test.diff" Content-Transfer-Encoding: 7bit diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c index 48514a3b781..50201780204 100644 --- a/src/backend/nodes/bitmapset.c +++ b/src/backend/nodes/bitmapset.c @@ -1010,6 +1010,7 @@ bms_offset_members(Bitmapset *a, int offset) int new_nwords; int old_nwords; int32 high_bit; + int64 old_highest; Assert(bms_is_valid_set(a)); @@ -1022,6 +1023,7 @@ bms_offset_members(Bitmapset *a, int offset) offset_bits = BITNUM(offset); new_nwords = a->nwords + offset_words; high_bit = bmw_leftmost_one_pos(a->words[a->nwords - 1]); + old_highest = ((int64) a->nwords - 1) * BITS_PER_BITMAPWORD + high_bit; /* Adjust the number of words in the new set to how many we'll need */ new_nwords += (offset_bits + high_bit) >= BITS_PER_BITMAPWORD; @@ -1041,7 +1043,8 @@ bms_offset_members(Bitmapset *a, int offset) * An unlikely scenario, but check we're not going to create a member * greater than PG_INT32_MAX. */ - if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit + offset_bits > PG_INT32_MAX) + /*if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit + offset_bits > PG_INT32_MAX)*/ + if (old_highest + offset > PG_INT32_MAX) elog(ERROR, "bitmapset overflow"); /* diff --git a/src/test/modules/test_bitmapset/expected/test_bitmapset.out b/src/test/modules/test_bitmapset/expected/test_bitmapset.out index 3a6f08427cd..984174cfaf7 100644 --- a/src/test/modules/test_bitmapset/expected/test_bitmapset.out +++ b/src/test/modules/test_bitmapset/expected/test_bitmapset.out @@ -1613,4 +1613,14 @@ SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; 10000 (1 row) +-- 2147483583 is PG_INT32_MAX - 64, so offsetting by 1 should succeed, +-- but offsetting it by 65 should fail with overflow error +SELECT test_random_offset_operations_check_highest(2147483583, 1) AS result; + result +------------ + 2147483584 +(1 row) + +SELECT test_random_offset_operations_check_highest(2147483583, 65) AS result; +ERROR: bitmapset overflow DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql index 5c5cfe3fc15..072d264ad9a 100644 --- a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql +++ b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql @@ -416,4 +416,9 @@ SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; -- perform some random test on bms_offset_members() SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; +-- 2147483583 is PG_INT32_MAX - 64, so offsetting by 1 should succeed, +-- but offsetting it by 65 should fail with overflow error +SELECT test_random_offset_operations_check_highest(2147483583, 1) AS result; +SELECT test_random_offset_operations_check_highest(2147483583, 65) AS result; + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql index ea12ec40057..88bf6719aa6 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql +++ b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql @@ -145,4 +145,8 @@ CREATE FUNCTION test_random_offset_operations(integer, integer, integer, integer RETURNS integer STRICT AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_random_offset_operations_check_highest(integer, integer) +RETURNS integer STRICT +AS 'MODULE_PATHNAME' LANGUAGE C; + COMMENT ON EXTENSION test_bitmapset IS 'Test code for Bitmapset'; diff --git a/src/test/modules/test_bitmapset/test_bitmapset.c b/src/test/modules/test_bitmapset/test_bitmapset.c index 1fbf9ffc200..59da3fb78ca 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset.c +++ b/src/test/modules/test_bitmapset/test_bitmapset.c @@ -67,6 +67,7 @@ PG_FUNCTION_INFO_V1(test_bitmap_match); /* Test utility functions */ PG_FUNCTION_INFO_V1(test_random_operations); PG_FUNCTION_INFO_V1(test_random_offset_operations); +PG_FUNCTION_INFO_V1(test_random_offset_operations_check_highest); /* Convenient macros to test results */ #define EXPECT_TRUE(expr) \ @@ -827,3 +828,20 @@ test_random_offset_operations(PG_FUNCTION_ARGS) PG_RETURN_INT32(num_ops); } + +Datum +test_random_offset_operations_check_highest(PG_FUNCTION_ARGS) +{ + int highest_member = 0; + int offset = 0; + Bitmapset *bms; + + highest_member = PG_GETARG_INT32(0); + offset = PG_GETARG_INT32(1); + + bms = bms_add_member(NULL, highest_member); + bms = bms_offset_members(bms, offset); + highest_member = bms_next_member(bms, -1); + + PG_RETURN_INT32(highest_member); +} \ No newline at end of file --Apple-Mail=_E88B1AFD-49AE-4390-BE40-3D28F579A457--