public inbox for [email protected]help / color / mirror / Atom feed
Add bms_offset_members() function for bitshifting Bitmapsets 15+ messages / 5 participants [nested] [flat]
* Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-13 04:35 David Rowley <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: David Rowley @ 2026-04-13 04:35 UTC (permalink / raw) To: PostgreSQL Developers <[email protected]> (v20 material) While working on some new code which required offsetting the members of a Bitmapset, I decided to go and write a function to do this rather than copy the various other places where we manually construct a new set with a bms_next_member() -> bms_add_member() loop. The new use case I have is from pulling varattnos from a scan's targetlist, which there could be several hundred Vars in. I considered this might be noticeably expensive. The current manual way we have of doing this is a bit laborious since we're only doing 1 member per bms_next_member() loop, and also, if the set has multiple words, we may end up doing several repallocs to expand the set, perhaps as little as 1 word at a time. That's not to mention the rather repetitive code that we have to do this in various places that might be nice to consolidate. I've attached a patch which adds bms_offset_members(), which does bitshifting to move the members up or down by the given offset. While working on this I made a few choices which might be worth a revisit: 1) The function modifies the given set in-place rather than making a new set. 2) The function will ERROR if any member would go above INT_MAX. These would be inaccessible, and that seems weird/wrong. 3) When offsetting by a negative value, members that would go below zero fall out of the set silently. For #1; my original use-case that made me write this didn't need a copy, so I wrote the function to modify the set in-place. After hunting down and replacing the relevant existing bms_next_member() loops with the new function, I noticed all these seem to need a copy. Currently, I have coded the patch to do bms_offset_members(bms_copy(set), ...), but that's a little inefficient as it *could* result in a palloc for the copy then a repalloc in the offset. If bms_offset_members() just created a new set, then it could palloc() a set to the exact required size. The counterargument to that is that I don't really want to copy the set for my intended use case. I thought about two versions, but thought that might be overkill. There could be a boolean parameter to define that, but we don't do that anywhere else in bitmapset.c, or we could avoid a copy-paste of the code with an always-inlined helper function. I couldn't decide, so left it as-is. For #2, I could have equally made these fall off the top of the set, but I thought we might want to know about it in the unlikely event that this ever happens. #3 We commonly want to get rid of system columns from a pull_varattnos() set which is offset by FirstLowInvalidHeapAttributeNumber, so those disappearing silently is what most use cases seem to want. I expect that's not for revisiting, but I listed this one anyway as it flies in the face of #2. Patch attached. Comments welcome. David Attachments: [application/octet-stream] v1-0001-Introduce-bms_offset_members-function.patch (17.9K, 2-v1-0001-Introduce-bms_offset_members-function.patch) download | inline diff: From 4da8179f707a20e5d01955bbe1497ac6b3ebc2a4 Mon Sep 17 00:00:00 2001 From: David Rowley <[email protected]> Date: Fri, 20 Mar 2026 11:01:13 +1300 Subject: [PATCH v1] Introduce bms_offset_members() function Effectively a function to bitshift members by the specified number of bits. We have various fragments of code doing this manually with a bms_next_member() loop. We can do this more efficiently with bitshifting. --- src/backend/nodes/bitmapset.c | 158 ++++++++++++++++++ src/backend/optimizer/plan/setrefs.c | 10 +- src/backend/optimizer/prep/prepjointree.c | 28 ++-- src/backend/rewrite/rewriteManip.c | 8 +- src/backend/statistics/extended_stats.c | 12 +- src/include/nodes/bitmapset.h | 1 + .../expected/test_bitmapset.out | 38 +++++ .../test_bitmapset/sql/test_bitmapset.sql | 13 ++ .../test_bitmapset/test_bitmapset--1.0.sql | 8 + .../modules/test_bitmapset/test_bitmapset.c | 87 +++++++++- 10 files changed, 324 insertions(+), 39 deletions(-) diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c index f053d8c4d64..48514a3b781 100644 --- a/src/backend/nodes/bitmapset.c +++ b/src/backend/nodes/bitmapset.c @@ -991,6 +991,164 @@ bms_replace_members(Bitmapset *a, const Bitmapset *b) return a; } +/* + * bms_offset_members + * Adjust all existing members of the given set by adding 'offset' to + * each member. The offset value may be negative or positive. Members + * which would become negative as a result of a negative offset will be + * removed from the set, whereas too large an offset which would result + * in a member going > INT_MAX will result in an ERROR. + * + * The set is modified in-place when possible, but callers must assume the set + * can be reallocated and use the returned set. + */ +Bitmapset * +bms_offset_members(Bitmapset *a, int offset) +{ + int offset_words; + int offset_bits; + int new_nwords; + int old_nwords; + int32 high_bit; + + Assert(bms_is_valid_set(a)); + + /* Don't go to any effort if there's nothing to do */ + if (a == NULL) + return NULL; + + old_nwords = a->nwords; + offset_words = WORDNUM(offset); + offset_bits = BITNUM(offset); + new_nwords = a->nwords + offset_words; + high_bit = bmw_leftmost_one_pos(a->words[a->nwords - 1]); + + /* Adjust the number of words in the new set to how many we'll need */ + new_nwords += (offset_bits + high_bit) >= BITS_PER_BITMAPWORD; + new_nwords -= (offset_bits + high_bit) < 0; + + /* If we don't need any words then it must be an empty set */ + if (new_nwords <= 0) + { + pfree(a); + return NULL; + } + + /* Handle a positive offset (bitshift left) */ + if (offset > 0) + { + /* + * An unlikely scenario, but check we're not going to create a member + * greater than PG_INT32_MAX. + */ + if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit + offset_bits > PG_INT32_MAX) + elog(ERROR, "bitmapset overflow"); + + /* + * Reallocate the memory for the set if we need more words, otherwise + * we reuse the existing set. + */ + if (new_nwords > a->nwords) + a = (Bitmapset *) repalloc0(a, BITMAPSET_SIZE(old_nwords), + BITMAPSET_SIZE(new_nwords)); + + a->nwords = new_nwords; + + /* Shift words up to higher array elements, if needed */ + if (offset_words > 0) + { + for (int i = old_nwords - 1; i >= 0; i--) + a->words[i + offset_words] = a->words[i]; + + /* zero out the old members from the lower, now unused words */ + for (int i = 0; i < offset_words; i++) + a->words[i] = 0; + } + + /* Now handle shifting up the bits, if needed */ + if (offset_bits > 0) + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + + for (int i = offset_words; i < new_nwords; i++) + { + bitmapword carry = (a->words[i] >> carry_bits); + + /* shift bits up and carry the bits from the previous word */ + a->words[i] = (a->words[i] << offset_bits) | prev_carry; + prev_carry = carry; + } + } + } + + /* Handle negative offset (bitshift right) */ + else + { + /* Make the negative shift_words and shift_bits positive numbers */ + offset_words = 0 - offset_words; + offset_bits = 0 - offset_bits; + + /* + * Shrink the set to store the new number of words. We can't repalloc + * here as we need to still access the memory for the upper words to + * move them. In any case, repalloc with a smaller allocation likely + * is just going to return the same chunk, so doing so would just be + * needless overhead. + */ + a->nwords = new_nwords; + + /* Shift words down to lower elements, if needed */ + if (offset_words > 0) + { + for (int i = 0; i <= new_nwords; i++) + a->words[i] = (i + offset_words) >= old_nwords ? 0 : a->words[i + offset_words]; + + /* + * Don't bother with zeroing the upper words. They're no longer + * visible due to reducing nwords down to the new size. + */ + } + + /* Now handle shifting down the bits, if needed */ + if (offset_bits > 0) + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + + if (new_nwords < old_nwords) + prev_carry = (a->words[new_nwords] << carry_bits); + + for (int i = new_nwords - 1; i >= 0; i--) + { + bitmapword carry = (a->words[i] << carry_bits); + + /* shift bits down and carry the bits from the previous word */ + a->words[i] = (a->words[i] >> offset_bits) | prev_carry; + prev_carry = carry; + } + } + } + +#ifdef REALLOCATE_BITMAPSETS + + /* + * There's no guarantee that the repalloc returned a new pointer, so copy + * and free unconditionally here. + */ + a = bms_copy_and_free(a); +#else + + /* + * That was a fair bit of bit twiddling, so make sure we've still got a + * valid set with no leading zero words + */ + Assert(bms_is_valid_set(a)); +#endif + + return a; +} + /* * bms_add_range * Add members in the range of 'lower' to 'upper' to the set. diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c index ff0e875f2a2..86a8d993997 100644 --- a/src/backend/optimizer/plan/setrefs.c +++ b/src/backend/optimizer/plan/setrefs.c @@ -2057,16 +2057,10 @@ set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset) static Relids offset_relid_set(Relids relids, int rtoffset) { - Relids result = NULL; - int rtindex; - - /* If there's no offset to apply, we needn't recompute the value */ + /* If there's no offset to apply, we needn't make a copy */ if (rtoffset == 0) return relids; - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + rtoffset); - return result; + return bms_offset_members(bms_copy(relids), rtoffset); } /* diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 95bf51606cc..072fd47db9e 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -3751,22 +3751,22 @@ has_notnull_forced_var(PlannerInfo *root, List *forced_null_vars, if (bms_is_member(varno, right_state->nullable_rels)) continue; - /* - * Iterate over attributes and adjust the bitmap indexes by - * FirstLowInvalidHeapAttributeNumber to get the actual attribute - * numbers. - */ - attno = -1; - while ((attno = bms_next_member(attrs, attno)) >= 0) - { - AttrNumber real_attno = attno + FirstLowInvalidHeapAttributeNumber; + /* Find the lowest member to check if system columns are present */ + attno = bms_next_member(attrs, -1); - /* system columns cannot be NULL */ - if (real_attno < 0) - return true; + /* We checked for an empty set above */ + Assert(attno >= 0); - forcednullattnums = bms_add_member(forcednullattnums, real_attno); - } + /* system columns cannot be NULL */ + if (attno + FirstLowInvalidHeapAttributeNumber < 0) + return true; + + /* + * Adjust the bitmap indexes by FirstLowInvalidHeapAttributeNumber to + * get the actual attribute numbers. + */ + forcednullattnums = bms_offset_members(bms_copy(attrs), + FirstLowInvalidHeapAttributeNumber); rte = rt_fetch(varno, root->parse->rtable); diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index 4bf4aa0d6d1..e1017ddf42c 100644 --- a/src/backend/rewrite/rewriteManip.c +++ b/src/backend/rewrite/rewriteManip.c @@ -527,13 +527,7 @@ OffsetVarNodes(Node *node, int offset, int sublevels_up) static Relids offset_relid_set(Relids relids, int offset) { - Relids result = NULL; - int rtindex; - - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + offset); - return result; + return bms_offset_members(bms_copy(relids), offset); } /* diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 2b83355d26e..f2c0e2d35d9 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1673,8 +1673,7 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, */ if (!leakproof) { - Bitmapset *clause_attnums = NULL; - int attnum = -1; + Bitmapset *clause_attnums; /* * We have to check per-column privileges. *attnums has the attnums @@ -1685,13 +1684,8 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, * while attnums within *attnums aren't. Convert *attnums to the * offset style so we can combine the results. */ - while ((attnum = bms_next_member(*attnums, attnum)) >= 0) - { - clause_attnums = - bms_add_member(clause_attnums, - attnum - FirstLowInvalidHeapAttributeNumber); - } - + clause_attnums = bms_offset_members(bms_copy(*attnums), + 0 - FirstLowInvalidHeapAttributeNumber); /* Now merge attnums from *exprs into clause_attnums */ if (*exprs != NIL) pull_varattnos((Node *) *exprs, relid, &clause_attnums); diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h index 067ec72e99b..748fa2ad58b 100644 --- a/src/include/nodes/bitmapset.h +++ b/src/include/nodes/bitmapset.h @@ -123,6 +123,7 @@ extern Bitmapset *bms_add_member(Bitmapset *a, int x); extern Bitmapset *bms_del_member(Bitmapset *a, int x); extern Bitmapset *bms_add_members(Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_replace_members(Bitmapset *a, const Bitmapset *b); +extern Bitmapset *bms_offset_members(Bitmapset *a, int offset); extern Bitmapset *bms_add_range(Bitmapset *a, int lower, int upper); extern Bitmapset *bms_int_members(Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_del_members(Bitmapset *a, const Bitmapset *b); diff --git a/src/test/modules/test_bitmapset/expected/test_bitmapset.out b/src/test/modules/test_bitmapset/expected/test_bitmapset.out index f75cb46b869..3a6f08427cd 100644 --- a/src/test/modules/test_bitmapset/expected/test_bitmapset.out +++ b/src/test/modules/test_bitmapset/expected/test_bitmapset.out @@ -760,6 +760,37 @@ SELECT test_bms_compare(NULL, NULL) AS result; 0 (1 row) +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +ERROR: bitmapset overflow +SELECT test_bms_offset_members('(b 2)', 2147483646); +ERROR: bitmapset overflow +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); + test_bms_offset_members +------------------------- + (b 2 4 6) +(1 row) + +SELECT test_bms_offset_members('(b 1 3 5)', -1); + test_bms_offset_members +------------------------- + (b 0 2 4) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); + test_bms_offset_members +------------------------- + (b 32 33 64 65) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); + test_bms_offset_members +------------------------- + (b 30 31 62 63) +(1 row) + -- bms_add_range() SELECT test_bms_add_range('(b)', -5, 10); -- error ERROR: negative bitmapset member not allowed @@ -1575,4 +1606,11 @@ SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; t (1 row) +-- perform some random test on bms_offset_members() +SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; + result +-------- + 10000 +(1 row) + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql index e75ab8f620a..5c5cfe3fc15 100644 --- a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql +++ b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql @@ -201,6 +201,16 @@ SELECT test_bms_compare('(b 5)', NULL) AS result; SELECT test_bms_compare(NULL, '(b 5)') AS result; SELECT test_bms_compare(NULL, NULL) AS result; +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +SELECT test_bms_offset_members('(b 2)', 2147483646); +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); +SELECT test_bms_offset_members('(b 1 3 5)', -1); +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); + -- bms_add_range() SELECT test_bms_add_range('(b)', -5, 10); -- error SELECT test_bms_add_range('(b)', 5, 7) AS result; @@ -403,4 +413,7 @@ SELECT test_bms_nonempty_difference('(b 1 2)', '(b 50 100)') AS result; -- random operations SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; +-- perform some random test on bms_offset_members() +SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql index 227ecb5aa3b..ea12ec40057 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql +++ b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql @@ -120,6 +120,10 @@ CREATE FUNCTION test_bms_replace_members(text, text) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_bms_offset_members(text, integer) +RETURNS text +AS 'MODULE_PATHNAME' LANGUAGE C; + CREATE FUNCTION test_bms_join(text, text) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C; @@ -137,4 +141,8 @@ CREATE FUNCTION test_random_operations(integer, integer, integer, integer) RETURNS integer STRICT AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_random_offset_operations(integer, integer, integer, integer) +RETURNS integer STRICT +AS 'MODULE_PATHNAME' LANGUAGE C; + COMMENT ON EXTENSION test_bitmapset IS 'Test code for Bitmapset'; diff --git a/src/test/modules/test_bitmapset/test_bitmapset.c b/src/test/modules/test_bitmapset/test_bitmapset.c index 272bed390a6..1fbf9ffc200 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset.c +++ b/src/test/modules/test_bitmapset/test_bitmapset.c @@ -59,12 +59,14 @@ PG_FUNCTION_INFO_V1(test_bms_add_members); PG_FUNCTION_INFO_V1(test_bms_int_members); PG_FUNCTION_INFO_V1(test_bms_del_members); PG_FUNCTION_INFO_V1(test_bms_replace_members); +PG_FUNCTION_INFO_V1(test_bms_offset_members); PG_FUNCTION_INFO_V1(test_bms_join); PG_FUNCTION_INFO_V1(test_bitmap_hash); PG_FUNCTION_INFO_V1(test_bitmap_match); /* Test utility functions */ PG_FUNCTION_INFO_V1(test_random_operations); +PG_FUNCTION_INFO_V1(test_random_offset_operations); /* Convenient macros to test results */ #define EXPECT_TRUE(expr) \ @@ -530,6 +532,18 @@ test_bms_replace_members(PG_FUNCTION_ARGS) PG_RETURN_BITMAPSET_AS_TEXT(bms1); } +Datum +test_bms_offset_members(PG_FUNCTION_ARGS) +{ + Bitmapset *bms = PG_ARG_GETBITMAPSET(0); + int offset = PG_GETARG_INT32(1); + + /* left input gets recycled */ + bms = bms_offset_members(bms, offset); + + PG_RETURN_BITMAPSET_AS_TEXT(bms); +} + Datum test_bms_join(PG_FUNCTION_ARGS) { @@ -580,7 +594,7 @@ test_bitmap_match(PG_FUNCTION_ARGS) } /* - * Contrary to all the other functions which are one-one mappings with the + * Contrary to most of the other functions which are one-one mappings with the * equivalent C functions, this stresses Bitmapsets in a random fashion for * various operations. * @@ -742,3 +756,74 @@ test_random_operations(PG_FUNCTION_ARGS) PG_RETURN_INT32(total_ops); } + +/* + * Random testing for bms_offset_members(). Generates a random set and then + * picks a number to offset the members by. We then create another set, which + * is built by looping over the members of the random set and performing + * bms_add_member and adding on the offset to create a known good set to + * compare the result of bms_offset_members() to. + * + * Arguments: + * arg1: optional random seed, or < 0 to use a random seed. + * arg2: the number of operations to perform. + * arg3: the maximum bitmapset member number to use in the random set. + * arg4: the minimum bitmapset member number to use in the random set. + */ +Datum +test_random_offset_operations(PG_FUNCTION_ARGS) +{ + Bitmapset *bms1; + Bitmapset *bms2; + pg_prng_state state; + uint64 seed = GetCurrentTimestamp(); + int num_ops; + int max_range; + int min_value; + int member; + + if (PG_GETARG_INT32(0) > 0) + seed = PG_GETARG_INT32(0); + + num_ops = PG_GETARG_INT32(1); + max_range = PG_GETARG_INT32(2); + min_value = PG_GETARG_INT32(3); + + pg_prng_seed(&state, seed); + + for (int op = 0; op < num_ops; op++) + { + int offset; + + bms1 = NULL; + for (int i = 0; i < 10; i++) + { + member = pg_prng_uint32(&state) % max_range + min_value; + + bms1 = bms_add_member(bms1, member); + } + + offset = (pg_prng_uint32(&state) % max_range) - (pg_prng_uint32(&state) % max_range); + + /* create a known-good set the old fashioned way */ + bms2 = NULL; + member = -1; + while ((member = bms_next_member(bms1, member)) >= 0) + { + if (member + offset >= 0) + bms2 = bms_add_member(bms2, member + offset); + } + + /* do the offsetting */ + bms1 = bms_offset_members(bms1, offset); + + /* check against the known-good set */ + if (!bms_equal(bms1, bms2)) + elog(ERROR, "bms_offset_members failed with offset %d seed " INT64_FORMAT, offset, seed); + + bms_free(bms1); + bms_free(bms2); + } + + PG_RETURN_INT32(num_ops); +} -- 2.51.0 ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-14 08:46 Chao Li <[email protected]> parent: David Rowley <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Chao Li @ 2026-04-14 08:46 UTC (permalink / raw) To: David Rowley <[email protected]>; +Cc: PostgreSQL Developers <[email protected]> > On Apr 13, 2026, at 12:35, David Rowley <[email protected]> wrote: > > (v20 material) > > While working on some new code which required offsetting the members > of a Bitmapset, I decided to go and write a function to do this rather > than copy the various other places where we manually construct a new > set with a bms_next_member() -> bms_add_member() loop. The new use > case I have is from pulling varattnos from a scan's targetlist, which > there could be several hundred Vars in. I considered this might be > noticeably expensive. > > The current manual way we have of doing this is a bit laborious since > we're only doing 1 member per bms_next_member() loop, and also, if the > set has multiple words, we may end up doing several repallocs to > expand the set, perhaps as little as 1 word at a time. That's not to > mention the rather repetitive code that we have to do this in various > places that might be nice to consolidate. > > I've attached a patch which adds bms_offset_members(), which does > bitshifting to move the members up or down by the given offset. While > working on this I made a few choices which might be worth a revisit: > > 1) The function modifies the given set in-place rather than making a new set. > 2) The function will ERROR if any member would go above INT_MAX. These > would be inaccessible, and that seems weird/wrong. > 3) When offsetting by a negative value, members that would go below > zero fall out of the set silently. > > For #1; my original use-case that made me write this didn't need a > copy, so I wrote the function to modify the set in-place. After > hunting down and replacing the relevant existing bms_next_member() > loops with the new function, I noticed all these seem to need a copy. > Currently, I have coded the patch to do > bms_offset_members(bms_copy(set), ...), but that's a little > inefficient as it *could* result in a palloc for the copy then a > repalloc in the offset. If bms_offset_members() just created a new > set, then it could palloc() a set to the exact required size. The > counterargument to that is that I don't really want to copy the set > for my intended use case. I thought about two versions, but thought > that might be overkill. There could be a boolean parameter to define > that, but we don't do that anywhere else in bitmapset.c, or we could > avoid a copy-paste of the code with an always-inlined helper function. > I couldn't decide, so left it as-is. > > For #2, I could have equally made these fall off the top of the set, > but I thought we might want to know about it in the unlikely event > that this ever happens. > > #3 We commonly want to get rid of system columns from a > pull_varattnos() set which is offset by > FirstLowInvalidHeapAttributeNumber, so those disappearing silently is > what most use cases seem to want. I expect that's not for revisiting, > but I listed this one anyway as it flies in the face of #2. > > Patch attached. Comments welcome. > > David > <v1-0001-Introduce-bms_offset_members-function.patch> I basically agree with design choices 1/2/3. And the implementation of v1 overall looks good to me. The only issue I found is this overflow check: ``` + /* Handle a positive offset (bitshift left) */ + if (offset > 0) + { + /* + * An unlikely scenario, but check we're not going to create a member + * greater than PG_INT32_MAX. + */ + if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit + offset_bits > PG_INT32_MAX) + elog(ERROR, "bitmapset overflow"); ``` This overflow check seems wrong. Because when high_bit + offset_bits > BITS_PER_BITMAPWORD, new_nwords has been increased by 1, so there high_bit + offset_bits are double counted. To verify that, I added a new test: ``` -- 2147483583 is PG_INT32_MAX - 64, so offsetting by 1 should succeed, -- but offsetting it by 65 should fail with overflow error SELECT test_random_offset_operations_check_highest(2147483583, 1) AS result; SELECT test_random_offset_operations_check_highest(2147483583, 65) AS result; ``` With v1, test_random_offset_operations_check_highest(2147483583, 1) reports an overflow error, but it should not. Please see the attached diff for the test I added. In that diff, I also included a fix, and with that fix, the tests pass. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ Attachments: [application/octet-stream] nocfbot_chao_test.diff (4.5K, 2-nocfbot_chao_test.diff) download | inline diff: diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c index 48514a3b781..50201780204 100644 --- a/src/backend/nodes/bitmapset.c +++ b/src/backend/nodes/bitmapset.c @@ -1010,6 +1010,7 @@ bms_offset_members(Bitmapset *a, int offset) int new_nwords; int old_nwords; int32 high_bit; + int64 old_highest; Assert(bms_is_valid_set(a)); @@ -1022,6 +1023,7 @@ bms_offset_members(Bitmapset *a, int offset) offset_bits = BITNUM(offset); new_nwords = a->nwords + offset_words; high_bit = bmw_leftmost_one_pos(a->words[a->nwords - 1]); + old_highest = ((int64) a->nwords - 1) * BITS_PER_BITMAPWORD + high_bit; /* Adjust the number of words in the new set to how many we'll need */ new_nwords += (offset_bits + high_bit) >= BITS_PER_BITMAPWORD; @@ -1041,7 +1043,8 @@ bms_offset_members(Bitmapset *a, int offset) * An unlikely scenario, but check we're not going to create a member * greater than PG_INT32_MAX. */ - if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit + offset_bits > PG_INT32_MAX) + /*if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit + offset_bits > PG_INT32_MAX)*/ + if (old_highest + offset > PG_INT32_MAX) elog(ERROR, "bitmapset overflow"); /* diff --git a/src/test/modules/test_bitmapset/expected/test_bitmapset.out b/src/test/modules/test_bitmapset/expected/test_bitmapset.out index 3a6f08427cd..984174cfaf7 100644 --- a/src/test/modules/test_bitmapset/expected/test_bitmapset.out +++ b/src/test/modules/test_bitmapset/expected/test_bitmapset.out @@ -1613,4 +1613,14 @@ SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; 10000 (1 row) +-- 2147483583 is PG_INT32_MAX - 64, so offsetting by 1 should succeed, +-- but offsetting it by 65 should fail with overflow error +SELECT test_random_offset_operations_check_highest(2147483583, 1) AS result; + result +------------ + 2147483584 +(1 row) + +SELECT test_random_offset_operations_check_highest(2147483583, 65) AS result; +ERROR: bitmapset overflow DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql index 5c5cfe3fc15..072d264ad9a 100644 --- a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql +++ b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql @@ -416,4 +416,9 @@ SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; -- perform some random test on bms_offset_members() SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; +-- 2147483583 is PG_INT32_MAX - 64, so offsetting by 1 should succeed, +-- but offsetting it by 65 should fail with overflow error +SELECT test_random_offset_operations_check_highest(2147483583, 1) AS result; +SELECT test_random_offset_operations_check_highest(2147483583, 65) AS result; + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql index ea12ec40057..88bf6719aa6 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql +++ b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql @@ -145,4 +145,8 @@ CREATE FUNCTION test_random_offset_operations(integer, integer, integer, integer RETURNS integer STRICT AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_random_offset_operations_check_highest(integer, integer) +RETURNS integer STRICT +AS 'MODULE_PATHNAME' LANGUAGE C; + COMMENT ON EXTENSION test_bitmapset IS 'Test code for Bitmapset'; diff --git a/src/test/modules/test_bitmapset/test_bitmapset.c b/src/test/modules/test_bitmapset/test_bitmapset.c index 1fbf9ffc200..59da3fb78ca 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset.c +++ b/src/test/modules/test_bitmapset/test_bitmapset.c @@ -67,6 +67,7 @@ PG_FUNCTION_INFO_V1(test_bitmap_match); /* Test utility functions */ PG_FUNCTION_INFO_V1(test_random_operations); PG_FUNCTION_INFO_V1(test_random_offset_operations); +PG_FUNCTION_INFO_V1(test_random_offset_operations_check_highest); /* Convenient macros to test results */ #define EXPECT_TRUE(expr) \ @@ -827,3 +828,20 @@ test_random_offset_operations(PG_FUNCTION_ARGS) PG_RETURN_INT32(num_ops); } + +Datum +test_random_offset_operations_check_highest(PG_FUNCTION_ARGS) +{ + int highest_member = 0; + int offset = 0; + Bitmapset *bms; + + highest_member = PG_GETARG_INT32(0); + offset = PG_GETARG_INT32(1); + + bms = bms_add_member(NULL, highest_member); + bms = bms_offset_members(bms, offset); + highest_member = bms_next_member(bms, -1); + + PG_RETURN_INT32(highest_member); +} \ No newline at end of file ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-14 23:45 David Rowley <[email protected]> parent: Chao Li <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: David Rowley @ 2026-04-14 23:45 UTC (permalink / raw) To: Chao Li <[email protected]>; +Cc: PostgreSQL Developers <[email protected]> Thanks for looking. On Tue, 14 Apr 2026 at 20:46, Chao Li <[email protected]> wrote: > + if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit + offset_bits > PG_INT32_MAX) > + elog(ERROR, "bitmapset overflow"); > This overflow check seems wrong. Because when high_bit + offset_bits > BITS_PER_BITMAPWORD, new_nwords has been increased by 1, so there high_bit + offset_bits are double counted. Your idea of checking the old highest member plus the offset seems a more robust method, so I've adjusted the patch to use that. David Attachments: [application/octet-stream] v2-0001-Introduce-bms_offset_members-function.patch (17.9K, 2-v2-0001-Introduce-bms_offset_members-function.patch) download | inline diff: From c6dc1ef00d597d13e79df60eab320dc35fee7336 Mon Sep 17 00:00:00 2001 From: David Rowley <[email protected]> Date: Fri, 20 Mar 2026 11:01:13 +1300 Subject: [PATCH v2] Introduce bms_offset_members() function Effectively a function to bitshift members by the specified number of bits. We have various fragments of code doing this manually with a bms_next_member() loop. We can do this more efficiently with bitshifting. --- src/backend/nodes/bitmapset.c | 160 ++++++++++++++++++ src/backend/optimizer/plan/setrefs.c | 10 +- src/backend/optimizer/prep/prepjointree.c | 28 +-- src/backend/rewrite/rewriteManip.c | 8 +- src/backend/statistics/extended_stats.c | 12 +- src/include/nodes/bitmapset.h | 1 + .../expected/test_bitmapset.out | 38 +++++ .../test_bitmapset/sql/test_bitmapset.sql | 13 ++ .../test_bitmapset/test_bitmapset--1.0.sql | 8 + .../modules/test_bitmapset/test_bitmapset.c | 87 +++++++++- 10 files changed, 326 insertions(+), 39 deletions(-) diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c index f053d8c4d64..458f8f69c51 100644 --- a/src/backend/nodes/bitmapset.c +++ b/src/backend/nodes/bitmapset.c @@ -991,6 +991,166 @@ bms_replace_members(Bitmapset *a, const Bitmapset *b) return a; } +/* + * bms_offset_members + * Adjust all existing members of the given set by adding 'offset' to + * each member. The offset value may be negative or positive. Members + * which would become negative as a result of a negative offset will be + * removed from the set, whereas too large an offset which would result + * in a member going > INT_MAX will result in an ERROR. + * + * The set is modified in-place when possible, but callers must assume the set + * can be reallocated and use the returned set. + */ +Bitmapset * +bms_offset_members(Bitmapset *a, int offset) +{ + int offset_words; + int offset_bits; + int new_nwords; + int old_nwords; + int32 high_bit; + + Assert(bms_is_valid_set(a)); + + /* Don't go to any effort if there's nothing to do */ + if (a == NULL) + return NULL; + + old_nwords = a->nwords; + offset_words = WORDNUM(offset); + offset_bits = BITNUM(offset); + new_nwords = a->nwords + offset_words; + high_bit = bmw_leftmost_one_pos(a->words[a->nwords - 1]); + + /* Adjust the number of words in the new set to how many we'll need */ + new_nwords += (offset_bits + high_bit) >= BITS_PER_BITMAPWORD; + new_nwords -= (offset_bits + high_bit) < 0; + + /* If we don't need any words then it must be an empty set */ + if (new_nwords <= 0) + { + pfree(a); + return NULL; + } + + /* Handle a positive offset (bitshift left) */ + if (offset > 0) + { + int old_highest = (old_nwords - 1) * BITS_PER_BITMAPWORD + high_bit; + + /* + * An unlikely scenario, but check we're not going to create a member + * greater than PG_INT32_MAX. + */ + if ((uint64) old_highest + offset > PG_INT32_MAX) + elog(ERROR, "bitmapset overflow"); + + /* + * Reallocate the memory for the set if we need more words, otherwise + * we reuse the existing set. + */ + if (new_nwords > a->nwords) + a = (Bitmapset *) repalloc0(a, BITMAPSET_SIZE(old_nwords), + BITMAPSET_SIZE(new_nwords)); + + a->nwords = new_nwords; + + /* Shift words up to higher array elements, if needed */ + if (offset_words > 0) + { + for (int i = old_nwords - 1; i >= 0; i--) + a->words[i + offset_words] = a->words[i]; + + /* zero out the old members from the lower, now unused words */ + for (int i = 0; i < offset_words; i++) + a->words[i] = 0; + } + + /* Now handle shifting up the bits, if needed */ + if (offset_bits > 0) + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + + for (int i = offset_words; i < new_nwords; i++) + { + bitmapword carry = (a->words[i] >> carry_bits); + + /* shift bits up and carry the bits from the previous word */ + a->words[i] = (a->words[i] << offset_bits) | prev_carry; + prev_carry = carry; + } + } + } + + /* Handle negative offset (bitshift right) */ + else + { + /* Make the negative shift_words and shift_bits positive numbers */ + offset_words = 0 - offset_words; + offset_bits = 0 - offset_bits; + + /* + * Shrink the set to store the new number of words. We can't repalloc + * here as we need to still access the memory for the upper words to + * move them. In any case, repalloc with a smaller allocation likely + * is just going to return the same chunk, so doing so would just be + * needless overhead. + */ + a->nwords = new_nwords; + + /* Shift words down to lower elements, if needed */ + if (offset_words > 0) + { + for (int i = 0; i <= new_nwords; i++) + a->words[i] = (i + offset_words) >= old_nwords ? 0 : a->words[i + offset_words]; + + /* + * Don't bother with zeroing the upper words. They're no longer + * visible due to reducing nwords down to the new size. + */ + } + + /* Now handle shifting down the bits, if needed */ + if (offset_bits > 0) + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + + if (new_nwords < old_nwords) + prev_carry = (a->words[new_nwords] << carry_bits); + + for (int i = new_nwords - 1; i >= 0; i--) + { + bitmapword carry = (a->words[i] << carry_bits); + + /* shift bits down and carry the bits from the previous word */ + a->words[i] = (a->words[i] >> offset_bits) | prev_carry; + prev_carry = carry; + } + } + } + +#ifdef REALLOCATE_BITMAPSETS + + /* + * There's no guarantee that the repalloc returned a new pointer, so copy + * and free unconditionally here. + */ + a = bms_copy_and_free(a); +#else + + /* + * That was a fair bit of bit twiddling, so make sure we've still got a + * valid set with no leading zero words + */ + Assert(bms_is_valid_set(a)); +#endif + + return a; +} + /* * bms_add_range * Add members in the range of 'lower' to 'upper' to the set. diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c index ff0e875f2a2..86a8d993997 100644 --- a/src/backend/optimizer/plan/setrefs.c +++ b/src/backend/optimizer/plan/setrefs.c @@ -2057,16 +2057,10 @@ set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset) static Relids offset_relid_set(Relids relids, int rtoffset) { - Relids result = NULL; - int rtindex; - - /* If there's no offset to apply, we needn't recompute the value */ + /* If there's no offset to apply, we needn't make a copy */ if (rtoffset == 0) return relids; - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + rtoffset); - return result; + return bms_offset_members(bms_copy(relids), rtoffset); } /* diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 95bf51606cc..072fd47db9e 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -3751,22 +3751,22 @@ has_notnull_forced_var(PlannerInfo *root, List *forced_null_vars, if (bms_is_member(varno, right_state->nullable_rels)) continue; - /* - * Iterate over attributes and adjust the bitmap indexes by - * FirstLowInvalidHeapAttributeNumber to get the actual attribute - * numbers. - */ - attno = -1; - while ((attno = bms_next_member(attrs, attno)) >= 0) - { - AttrNumber real_attno = attno + FirstLowInvalidHeapAttributeNumber; + /* Find the lowest member to check if system columns are present */ + attno = bms_next_member(attrs, -1); - /* system columns cannot be NULL */ - if (real_attno < 0) - return true; + /* We checked for an empty set above */ + Assert(attno >= 0); - forcednullattnums = bms_add_member(forcednullattnums, real_attno); - } + /* system columns cannot be NULL */ + if (attno + FirstLowInvalidHeapAttributeNumber < 0) + return true; + + /* + * Adjust the bitmap indexes by FirstLowInvalidHeapAttributeNumber to + * get the actual attribute numbers. + */ + forcednullattnums = bms_offset_members(bms_copy(attrs), + FirstLowInvalidHeapAttributeNumber); rte = rt_fetch(varno, root->parse->rtable); diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index 4bf4aa0d6d1..e1017ddf42c 100644 --- a/src/backend/rewrite/rewriteManip.c +++ b/src/backend/rewrite/rewriteManip.c @@ -527,13 +527,7 @@ OffsetVarNodes(Node *node, int offset, int sublevels_up) static Relids offset_relid_set(Relids relids, int offset) { - Relids result = NULL; - int rtindex; - - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + offset); - return result; + return bms_offset_members(bms_copy(relids), offset); } /* diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 2b83355d26e..f2c0e2d35d9 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1673,8 +1673,7 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, */ if (!leakproof) { - Bitmapset *clause_attnums = NULL; - int attnum = -1; + Bitmapset *clause_attnums; /* * We have to check per-column privileges. *attnums has the attnums @@ -1685,13 +1684,8 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, * while attnums within *attnums aren't. Convert *attnums to the * offset style so we can combine the results. */ - while ((attnum = bms_next_member(*attnums, attnum)) >= 0) - { - clause_attnums = - bms_add_member(clause_attnums, - attnum - FirstLowInvalidHeapAttributeNumber); - } - + clause_attnums = bms_offset_members(bms_copy(*attnums), + 0 - FirstLowInvalidHeapAttributeNumber); /* Now merge attnums from *exprs into clause_attnums */ if (*exprs != NIL) pull_varattnos((Node *) *exprs, relid, &clause_attnums); diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h index 067ec72e99b..748fa2ad58b 100644 --- a/src/include/nodes/bitmapset.h +++ b/src/include/nodes/bitmapset.h @@ -123,6 +123,7 @@ extern Bitmapset *bms_add_member(Bitmapset *a, int x); extern Bitmapset *bms_del_member(Bitmapset *a, int x); extern Bitmapset *bms_add_members(Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_replace_members(Bitmapset *a, const Bitmapset *b); +extern Bitmapset *bms_offset_members(Bitmapset *a, int offset); extern Bitmapset *bms_add_range(Bitmapset *a, int lower, int upper); extern Bitmapset *bms_int_members(Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_del_members(Bitmapset *a, const Bitmapset *b); diff --git a/src/test/modules/test_bitmapset/expected/test_bitmapset.out b/src/test/modules/test_bitmapset/expected/test_bitmapset.out index f75cb46b869..3a6f08427cd 100644 --- a/src/test/modules/test_bitmapset/expected/test_bitmapset.out +++ b/src/test/modules/test_bitmapset/expected/test_bitmapset.out @@ -760,6 +760,37 @@ SELECT test_bms_compare(NULL, NULL) AS result; 0 (1 row) +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +ERROR: bitmapset overflow +SELECT test_bms_offset_members('(b 2)', 2147483646); +ERROR: bitmapset overflow +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); + test_bms_offset_members +------------------------- + (b 2 4 6) +(1 row) + +SELECT test_bms_offset_members('(b 1 3 5)', -1); + test_bms_offset_members +------------------------- + (b 0 2 4) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); + test_bms_offset_members +------------------------- + (b 32 33 64 65) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); + test_bms_offset_members +------------------------- + (b 30 31 62 63) +(1 row) + -- bms_add_range() SELECT test_bms_add_range('(b)', -5, 10); -- error ERROR: negative bitmapset member not allowed @@ -1575,4 +1606,11 @@ SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; t (1 row) +-- perform some random test on bms_offset_members() +SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; + result +-------- + 10000 +(1 row) + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql index e75ab8f620a..5c5cfe3fc15 100644 --- a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql +++ b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql @@ -201,6 +201,16 @@ SELECT test_bms_compare('(b 5)', NULL) AS result; SELECT test_bms_compare(NULL, '(b 5)') AS result; SELECT test_bms_compare(NULL, NULL) AS result; +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +SELECT test_bms_offset_members('(b 2)', 2147483646); +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); +SELECT test_bms_offset_members('(b 1 3 5)', -1); +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); + -- bms_add_range() SELECT test_bms_add_range('(b)', -5, 10); -- error SELECT test_bms_add_range('(b)', 5, 7) AS result; @@ -403,4 +413,7 @@ SELECT test_bms_nonempty_difference('(b 1 2)', '(b 50 100)') AS result; -- random operations SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; +-- perform some random test on bms_offset_members() +SELECT test_random_offset_operations(-1, 10000, 1024, 0) AS result; + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql index 227ecb5aa3b..ea12ec40057 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql +++ b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql @@ -120,6 +120,10 @@ CREATE FUNCTION test_bms_replace_members(text, text) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_bms_offset_members(text, integer) +RETURNS text +AS 'MODULE_PATHNAME' LANGUAGE C; + CREATE FUNCTION test_bms_join(text, text) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C; @@ -137,4 +141,8 @@ CREATE FUNCTION test_random_operations(integer, integer, integer, integer) RETURNS integer STRICT AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_random_offset_operations(integer, integer, integer, integer) +RETURNS integer STRICT +AS 'MODULE_PATHNAME' LANGUAGE C; + COMMENT ON EXTENSION test_bitmapset IS 'Test code for Bitmapset'; diff --git a/src/test/modules/test_bitmapset/test_bitmapset.c b/src/test/modules/test_bitmapset/test_bitmapset.c index 272bed390a6..1fbf9ffc200 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset.c +++ b/src/test/modules/test_bitmapset/test_bitmapset.c @@ -59,12 +59,14 @@ PG_FUNCTION_INFO_V1(test_bms_add_members); PG_FUNCTION_INFO_V1(test_bms_int_members); PG_FUNCTION_INFO_V1(test_bms_del_members); PG_FUNCTION_INFO_V1(test_bms_replace_members); +PG_FUNCTION_INFO_V1(test_bms_offset_members); PG_FUNCTION_INFO_V1(test_bms_join); PG_FUNCTION_INFO_V1(test_bitmap_hash); PG_FUNCTION_INFO_V1(test_bitmap_match); /* Test utility functions */ PG_FUNCTION_INFO_V1(test_random_operations); +PG_FUNCTION_INFO_V1(test_random_offset_operations); /* Convenient macros to test results */ #define EXPECT_TRUE(expr) \ @@ -530,6 +532,18 @@ test_bms_replace_members(PG_FUNCTION_ARGS) PG_RETURN_BITMAPSET_AS_TEXT(bms1); } +Datum +test_bms_offset_members(PG_FUNCTION_ARGS) +{ + Bitmapset *bms = PG_ARG_GETBITMAPSET(0); + int offset = PG_GETARG_INT32(1); + + /* left input gets recycled */ + bms = bms_offset_members(bms, offset); + + PG_RETURN_BITMAPSET_AS_TEXT(bms); +} + Datum test_bms_join(PG_FUNCTION_ARGS) { @@ -580,7 +594,7 @@ test_bitmap_match(PG_FUNCTION_ARGS) } /* - * Contrary to all the other functions which are one-one mappings with the + * Contrary to most of the other functions which are one-one mappings with the * equivalent C functions, this stresses Bitmapsets in a random fashion for * various operations. * @@ -742,3 +756,74 @@ test_random_operations(PG_FUNCTION_ARGS) PG_RETURN_INT32(total_ops); } + +/* + * Random testing for bms_offset_members(). Generates a random set and then + * picks a number to offset the members by. We then create another set, which + * is built by looping over the members of the random set and performing + * bms_add_member and adding on the offset to create a known good set to + * compare the result of bms_offset_members() to. + * + * Arguments: + * arg1: optional random seed, or < 0 to use a random seed. + * arg2: the number of operations to perform. + * arg3: the maximum bitmapset member number to use in the random set. + * arg4: the minimum bitmapset member number to use in the random set. + */ +Datum +test_random_offset_operations(PG_FUNCTION_ARGS) +{ + Bitmapset *bms1; + Bitmapset *bms2; + pg_prng_state state; + uint64 seed = GetCurrentTimestamp(); + int num_ops; + int max_range; + int min_value; + int member; + + if (PG_GETARG_INT32(0) > 0) + seed = PG_GETARG_INT32(0); + + num_ops = PG_GETARG_INT32(1); + max_range = PG_GETARG_INT32(2); + min_value = PG_GETARG_INT32(3); + + pg_prng_seed(&state, seed); + + for (int op = 0; op < num_ops; op++) + { + int offset; + + bms1 = NULL; + for (int i = 0; i < 10; i++) + { + member = pg_prng_uint32(&state) % max_range + min_value; + + bms1 = bms_add_member(bms1, member); + } + + offset = (pg_prng_uint32(&state) % max_range) - (pg_prng_uint32(&state) % max_range); + + /* create a known-good set the old fashioned way */ + bms2 = NULL; + member = -1; + while ((member = bms_next_member(bms1, member)) >= 0) + { + if (member + offset >= 0) + bms2 = bms_add_member(bms2, member + offset); + } + + /* do the offsetting */ + bms1 = bms_offset_members(bms1, offset); + + /* check against the known-good set */ + if (!bms_equal(bms1, bms2)) + elog(ERROR, "bms_offset_members failed with offset %d seed " INT64_FORMAT, offset, seed); + + bms_free(bms1); + bms_free(bms2); + } + + PG_RETURN_INT32(num_ops); +} -- 2.51.0 ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-15 00:29 Tom Lane <[email protected]> parent: David Rowley <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Tom Lane @ 2026-04-15 00:29 UTC (permalink / raw) To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> David Rowley <[email protected]> writes: > Your idea of checking the old highest member plus the offset seems a > more robust method, so I've adjusted the patch to use that. I question the decision to make this change the set in-place. Wouldn't it be cheaper and less surprise-prone to always make a copy? regards, tom lane ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-15 02:21 David Rowley <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: David Rowley @ 2026-04-15 02:21 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> On Wed, 15 Apr 2026 at 12:29, Tom Lane <[email protected]> wrote: > I question the decision to make this change the set in-place. > Wouldn't it be cheaper and less surprise-prone to always make > a copy? I'd not considered surprise-prone as an aspect. I understand we have bms_join and bms_union, which do the same thing if you only care about the value of the result and not what happens to the inputs. So I didn't think I was introducing anything too surpising given we've got various other Bitmapset functions that modify the input in-place. My expectation there was that people are used to checking what the behaviour of the bitmapset function is. For the current use cases of the function in the patch, I agree that it would likely be better for performance if the new function allocated a new set. It was more a question of whether we want to throw away performance for other cases which are fine with an in-place update and have a positive offset. Those will never repalloc(). I didn't really know the answer. I suspect with the current patch that each of the existing cases will be faster than today's bms_next_member loops, regardless. When I wrote the function, I was mainly thinking of the new use-case that I was working on, which won't require any palloc() or repalloc() with the current design. Since I was adding that to a fairly common code path, I thought I had more of a chance of not having to jump through too many hoops to ensure I don't cause any performance regressions. In short, I don't really know what's best. I'm certainly open to changing it if someone comes up with a good reason to do it the other way. Maybe catering for the majority of callers is a good enough reason to change it. David ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-15 02:30 Tom Lane <[email protected]> parent: David Rowley <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Tom Lane @ 2026-04-15 02:30 UTC (permalink / raw) To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> David Rowley <[email protected]> writes: > On Wed, 15 Apr 2026 at 12:29, Tom Lane <[email protected]> wrote: >> I question the decision to make this change the set in-place. >> Wouldn't it be cheaper and less surprise-prone to always make >> a copy? > I'd not considered surprise-prone as an aspect. I understand we have > bms_join and bms_union, which do the same thing if you only care about > the value of the result and not what happens to the inputs. Sure, but bms_join is an optional optimization of the far safer bms_union operation. It bothers me to create the optimized case but not the base case. regards, tom lane ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-15 02:33 David Rowley <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 2 replies; 15+ messages in thread From: David Rowley @ 2026-04-15 02:33 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> On Wed, 15 Apr 2026 at 14:30, Tom Lane <[email protected]> wrote: > > David Rowley <[email protected]> writes: > > I'd not considered surprise-prone as an aspect. I understand we have > > bms_join and bms_union, which do the same thing if you only care about > > the value of the result and not what happens to the inputs. > > Sure, but bms_join is an optional optimization of the far safer > bms_union operation. It bothers me to create the optimized case > but not the base case. Hmm, yeah. That seems like a good argument for making a new set. I'll go make it so. David ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-15 19:17 Peter Eisentraut <[email protected]> parent: David Rowley <[email protected]> 1 sibling, 1 reply; 15+ messages in thread From: Peter Eisentraut @ 2026-04-15 19:17 UTC (permalink / raw) To: David Rowley <[email protected]>; Tom Lane <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> On 15.04.26 04:33, David Rowley wrote: > On Wed, 15 Apr 2026 at 14:30, Tom Lane <[email protected]> wrote: >> >> David Rowley <[email protected]> writes: >>> I'd not considered surprise-prone as an aspect. I understand we have >>> bms_join and bms_union, which do the same thing if you only care about >>> the value of the result and not what happens to the inputs. >> >> Sure, but bms_join is an optional optimization of the far safer >> bms_union operation. It bothers me to create the optimized case >> but not the base case. > > Hmm, yeah. That seems like a good argument for making a new set. I'll > go make it so. Depending on what you end up doing, maybe a sprinkling of pg_nodiscard could be appropriate. ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-18 07:49 David Rowley <[email protected]> parent: David Rowley <[email protected]> 1 sibling, 1 reply; 15+ messages in thread From: David Rowley @ 2026-04-18 07:49 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> On Wed, 15 Apr 2026 at 14:33, David Rowley <[email protected]> wrote: > > On Wed, 15 Apr 2026 at 14:30, Tom Lane <[email protected]> wrote: > > > > David Rowley <[email protected]> writes: > > > I'd not considered surprise-prone as an aspect. I understand we have > > > bms_join and bms_union, which do the same thing if you only care about > > > the value of the result and not what happens to the inputs. > > > > Sure, but bms_join is an optional optimization of the far safer > > bms_union operation. It bothers me to create the optimized case > > but not the base case. > > Hmm, yeah. That seems like a good argument for making a new set. I'll > go make it so. Patch attached for the version that creates a new set rather than modifying the input set in-place. David Attachments: [application/octet-stream] v2-0001-Introduce-bms_offset_members-function.patch (20.4K, 2-v2-0001-Introduce-bms_offset_members-function.patch) download | inline diff: From f65629a86d25f1247c91b1a2fbbba7869c51192a Mon Sep 17 00:00:00 2001 From: David Rowley <[email protected]> Date: Fri, 20 Mar 2026 11:01:13 +1300 Subject: [PATCH v2] Introduce bms_offset_members() function Effectively a function to bitshift members by the specified number of bits. We have various fragments of code doing this manually with a bms_next_member() loop. We can do this more efficiently with bitshifting. --- src/backend/nodes/bitmapset.c | 136 ++++++++++++++++++ src/backend/optimizer/plan/setrefs.c | 10 +- src/backend/optimizer/prep/prepjointree.c | 30 ++-- src/backend/rewrite/rewriteManip.c | 25 +--- src/backend/statistics/extended_stats.c | 12 +- src/include/nodes/bitmapset.h | 1 + .../expected/test_bitmapset.out | 50 +++++++ .../test_bitmapset/sql/test_bitmapset.sql | 15 ++ .../test_bitmapset/test_bitmapset--1.0.sql | 8 ++ .../modules/test_bitmapset/test_bitmapset.c | 105 +++++++++++++- 10 files changed, 339 insertions(+), 53 deletions(-) diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c index f053d8c4d64..f3f87f11cfd 100644 --- a/src/backend/nodes/bitmapset.c +++ b/src/backend/nodes/bitmapset.c @@ -39,6 +39,7 @@ #include "postgres.h" #include "common/hashfn.h" +#include "common/int.h" #include "nodes/bitmapset.h" #include "nodes/pg_list.h" #include "port/pg_bitutils.h" @@ -405,6 +406,140 @@ bms_difference(const Bitmapset *a, const Bitmapset *b) return result; } +/* + * bms_offset_members + * Creates a new Bitmapset with all members of 'a' adjusted to add the + * value of 'offset' to each member. + * + * Members which would become negative as a result of a negative offset will + * be removed from the set, whereas too large an offset, which would result in + * a member going > INT_MAX, will result in an ERROR. + */ +Bitmapset * +bms_offset_members(const Bitmapset *a, int offset) +{ + Bitmapset *result; + int offset_words; + int offset_bits; + int new_nwords; + int old_nwords; + int32 high_bit; + int old_highest; + int new_highest; + + Assert(bms_is_valid_set(a)); + + /* nothing to do for empty sets */ + if (a == NULL) + return NULL; + + old_nwords = a->nwords; + offset_words = WORDNUM(offset); + offset_bits = BITNUM(offset); + high_bit = bmw_leftmost_one_pos(a->words[a->nwords - 1]); + old_highest = (old_nwords - 1) * BITS_PER_BITMAPWORD + high_bit; + + /* don't create a set with a member that doesn't fit into an int32 */ + if (pg_add_s32_overflow(old_highest, offset, &new_highest)) + elog(ERROR, "bitmapset overflow"); + /* return NULL if the new set would be empty */ + else if (new_highest < 0) + return NULL; + + new_nwords = WORDNUM(new_highest) + 1; + result = (Bitmapset *) palloc0(BITMAPSET_SIZE(new_nwords)); + result->type = T_Bitmapset; + result->nwords = new_nwords; + + /* handle zero and positive offsets (bitshift left) */ + if (offset >= 0) + { + /* + * We special-case offsetting only by whole words so we don't have to + * special-case bitshifting by BITS_PER_BITMAPWORD places, which has + * an undefined behavior. + */ + if (offset_bits == 0) + { + int i = 0; + + /* + * The old set is guaranteed to have at least 1 word, so use + * do/while to save the redundant initial loop bounds check. + */ + do + { + Assert(i + offset_words < new_nwords); + result->words[i + offset_words] = a->words[i]; + } while (++i < old_nwords); + } + else + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + int i = 0; + + do + { + bitmapword carry = (a->words[i] >> carry_bits); + + Assert(i + offset_words < new_nwords); + /* shift bits up and carry bits from the previous word */ + result->words[i + offset_words] = (a->words[i] << offset_bits) | prev_carry; + prev_carry = carry; + } while (++i < old_nwords); + result->words[new_nwords - 1] |= prev_carry; + } + } + + /* handle negative offset (bitshift right) */ + else + { + /* make the negative offset_words and offset_bits positive */ + offset_words = 0 - offset_words; + offset_bits = 0 - offset_bits; + + /* as above, special case shifting only by whole words */ + if (offset_bits == 0) + { + int i = 0; + + do + { + Assert(i + offset_words < old_nwords); + result->words[i] = a->words[i + offset_words]; + } while (++i < new_nwords); + } + else + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + int i = new_nwords - 1; + + /* carry bits from any word just above where the loop starts */ + if (old_nwords > new_nwords + offset_words) + prev_carry = (a->words[new_nwords + offset_words] << carry_bits); + + /* + * We loop backward over the array so we correctly carry bits from + * higher words. + */ + do + { + bitmapword carry = (a->words[i + offset_words] << carry_bits); + + Assert(i + offset_words < old_nwords); + + /* shift bits down and carry bits from the previous word */ + result->words[i] = (a->words[i + offset_words] >> offset_bits) | prev_carry; + prev_carry = carry; + } while (--i >= 0); + } + } + + return result; +} + /* * bms_is_subset - is A a subset of B? */ @@ -991,6 +1126,7 @@ bms_replace_members(Bitmapset *a, const Bitmapset *b) return a; } + /* * bms_add_range * Add members in the range of 'lower' to 'upper' to the set. diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c index ff0e875f2a2..ebae7a8dea9 100644 --- a/src/backend/optimizer/plan/setrefs.c +++ b/src/backend/optimizer/plan/setrefs.c @@ -2057,16 +2057,10 @@ set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset) static Relids offset_relid_set(Relids relids, int rtoffset) { - Relids result = NULL; - int rtindex; - - /* If there's no offset to apply, we needn't recompute the value */ + /* If there's no offset to apply, we needn't make another set */ if (rtoffset == 0) return relids; - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + rtoffset); - return result; + return bms_offset_members(relids, rtoffset); } /* diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 95bf51606cc..78281262ba1 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -3731,7 +3731,7 @@ has_notnull_forced_var(PlannerInfo *root, List *forced_null_vars, RangeTblEntry *rte; Bitmapset *notnullattnums; Bitmapset *forcednullattnums = NULL; - int attno; + int lowest_attno; varno++; @@ -3751,22 +3751,22 @@ has_notnull_forced_var(PlannerInfo *root, List *forced_null_vars, if (bms_is_member(varno, right_state->nullable_rels)) continue; - /* - * Iterate over attributes and adjust the bitmap indexes by - * FirstLowInvalidHeapAttributeNumber to get the actual attribute - * numbers. - */ - attno = -1; - while ((attno = bms_next_member(attrs, attno)) >= 0) - { - AttrNumber real_attno = attno + FirstLowInvalidHeapAttributeNumber; + /* find the lowest member to check if system columns are present */ + lowest_attno = bms_next_member(attrs, -1); - /* system columns cannot be NULL */ - if (real_attno < 0) - return true; + /* we checked for an empty set above */ + Assert(lowest_attno >= 0); - forcednullattnums = bms_add_member(forcednullattnums, real_attno); - } + /* system columns cannot be NULL */ + if (lowest_attno + FirstLowInvalidHeapAttributeNumber < 0) + return true; + + /* + * Offset the bitmap members by FirstLowInvalidHeapAttributeNumber to + * get the actual attribute numbers. + */ + forcednullattnums = bms_offset_members(attrs, + FirstLowInvalidHeapAttributeNumber); rte = rt_fetch(varno, root->parse->rtable); diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index 4bf4aa0d6d1..6137f4303f2 100644 --- a/src/backend/rewrite/rewriteManip.c +++ b/src/backend/rewrite/rewriteManip.c @@ -64,7 +64,6 @@ static bool contain_windowfuncs_walker(Node *node, void *context); static bool locate_windowfunc_walker(Node *node, locate_windowfunc_context *context); static bool checkExprHasSubLink_walker(Node *node, void *context); -static Relids offset_relid_set(Relids relids, int offset); static Node *add_nulling_relids_mutator(Node *node, add_nulling_relids_context *context); static Node *remove_nulling_relids_mutator(Node *node, @@ -397,8 +396,8 @@ OffsetVarNodes_walker(Node *node, OffsetVarNodes_context *context) if (var->varlevelsup == context->sublevels_up) { var->varno += context->offset; - var->varnullingrels = offset_relid_set(var->varnullingrels, - context->offset); + var->varnullingrels = bms_offset_members(var->varnullingrels, + context->offset); if (var->varnosyn > 0) var->varnosyn += context->offset; } @@ -435,10 +434,10 @@ OffsetVarNodes_walker(Node *node, OffsetVarNodes_context *context) if (phv->phlevelsup == context->sublevels_up) { - phv->phrels = offset_relid_set(phv->phrels, - context->offset); - phv->phnullingrels = offset_relid_set(phv->phnullingrels, - context->offset); + phv->phrels = bms_offset_members(phv->phrels, + context->offset); + phv->phnullingrels = bms_offset_members(phv->phnullingrels, + context->offset); } /* fall through to examine children */ } @@ -524,18 +523,6 @@ OffsetVarNodes(Node *node, int offset, int sublevels_up) OffsetVarNodes_walker(node, &context); } -static Relids -offset_relid_set(Relids relids, int offset) -{ - Relids result = NULL; - int rtindex; - - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + offset); - return result; -} - /* * ChangeVarNodes - adjust Var nodes for a specific change of RT index * diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 2b83355d26e..332e7423bd8 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1673,8 +1673,7 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, */ if (!leakproof) { - Bitmapset *clause_attnums = NULL; - int attnum = -1; + Bitmapset *clause_attnums; /* * We have to check per-column privileges. *attnums has the attnums @@ -1685,13 +1684,8 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, * while attnums within *attnums aren't. Convert *attnums to the * offset style so we can combine the results. */ - while ((attnum = bms_next_member(*attnums, attnum)) >= 0) - { - clause_attnums = - bms_add_member(clause_attnums, - attnum - FirstLowInvalidHeapAttributeNumber); - } - + clause_attnums = bms_offset_members(*attnums, + 0 - FirstLowInvalidHeapAttributeNumber); /* Now merge attnums from *exprs into clause_attnums */ if (*exprs != NIL) pull_varattnos((Node *) *exprs, relid, &clause_attnums); diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h index 067ec72e99b..997f8a1cd96 100644 --- a/src/include/nodes/bitmapset.h +++ b/src/include/nodes/bitmapset.h @@ -100,6 +100,7 @@ extern void bms_free(Bitmapset *a); extern Bitmapset *bms_union(const Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_intersect(const Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_difference(const Bitmapset *a, const Bitmapset *b); +extern Bitmapset *bms_offset_members(const Bitmapset *a, int offset); extern bool bms_is_subset(const Bitmapset *a, const Bitmapset *b); extern BMS_Comparison bms_subset_compare(const Bitmapset *a, const Bitmapset *b); extern bool bms_is_member(int x, const Bitmapset *a); diff --git a/src/test/modules/test_bitmapset/expected/test_bitmapset.out b/src/test/modules/test_bitmapset/expected/test_bitmapset.out index f75cb46b869..568b97bb635 100644 --- a/src/test/modules/test_bitmapset/expected/test_bitmapset.out +++ b/src/test/modules/test_bitmapset/expected/test_bitmapset.out @@ -528,6 +528,49 @@ SELECT test_bms_difference(NULL, NULL) AS result; <> (1 row) +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +ERROR: bitmapset overflow +SELECT test_bms_offset_members('(b 2)', 2147483646); +ERROR: bitmapset overflow +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); + test_bms_offset_members +------------------------- + (b 2 4 6) +(1 row) + +SELECT test_bms_offset_members('(b 1 3 5)', 0); + test_bms_offset_members +------------------------- + (b 1 3 5) +(1 row) + +SELECT test_bms_offset_members('(b 1 3 5)', -1); + test_bms_offset_members +------------------------- + (b 0 2 4) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); + test_bms_offset_members +------------------------- + (b 32 33 64 65) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', 0); + test_bms_offset_members +------------------------- + (b 31 32 63 64) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); + test_bms_offset_members +------------------------- + (b 30 31 62 63) +(1 row) + -- bms_is_member() SELECT test_bms_is_member('(b)', -5); -- error ERROR: negative bitmapset member not allowed @@ -1575,4 +1618,11 @@ SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; t (1 row) +-- perform some random test on bms_offset_members() +SELECT test_random_offset_operations(NULL, 10000, 1024, 0) AS result; + result +-------- + 10000 +(1 row) + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql index e75ab8f620a..9b852d6d990 100644 --- a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql +++ b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql @@ -145,6 +145,18 @@ SELECT test_bms_difference('(b 5)', NULL) AS result; SELECT test_bms_difference(NULL, '(b 5)') AS result; SELECT test_bms_difference(NULL, NULL) AS result; +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +SELECT test_bms_offset_members('(b 2)', 2147483646); +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); +SELECT test_bms_offset_members('(b 1 3 5)', 0); +SELECT test_bms_offset_members('(b 1 3 5)', -1); +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); +SELECT test_bms_offset_members('(b 31 32 63 64)', 0); +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); + -- bms_is_member() SELECT test_bms_is_member('(b)', -5); -- error SELECT test_bms_is_member('(b 1 3 5)', 1) AS result; @@ -403,4 +415,7 @@ SELECT test_bms_nonempty_difference('(b 1 2)', '(b 50 100)') AS result; -- random operations SELECT test_random_operations(-1, 10000, 81920, 0) > 0 AS result; +-- perform some random test on bms_offset_members() +SELECT test_random_offset_operations(NULL, 10000, 1024, 0) AS result; + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql index 227ecb5aa3b..ee58319e4d9 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql +++ b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql @@ -56,6 +56,10 @@ CREATE FUNCTION test_bms_difference(text, text) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_bms_offset_members(text, integer) +RETURNS text +AS 'MODULE_PATHNAME' LANGUAGE C; + CREATE FUNCTION test_bms_is_empty(text) RETURNS boolean AS 'MODULE_PATHNAME' LANGUAGE C; @@ -137,4 +141,8 @@ CREATE FUNCTION test_random_operations(integer, integer, integer, integer) RETURNS integer STRICT AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_random_offset_operations(bigint, integer, integer, integer) +RETURNS integer +AS 'MODULE_PATHNAME' LANGUAGE C; + COMMENT ON EXTENSION test_bitmapset IS 'Test code for Bitmapset'; diff --git a/src/test/modules/test_bitmapset/test_bitmapset.c b/src/test/modules/test_bitmapset/test_bitmapset.c index 272bed390a6..c134df97cd5 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset.c +++ b/src/test/modules/test_bitmapset/test_bitmapset.c @@ -19,11 +19,12 @@ #include <stddef.h> #include "catalog/pg_type.h" #include "common/pg_prng.h" -#include "utils/array.h" #include "fmgr.h" +#include "miscadmin.h" #include "nodes/bitmapset.h" #include "nodes/nodes.h" #include "nodes/pg_list.h" +#include "utils/array.h" #include "utils/builtins.h" #include "utils/timestamp.h" @@ -43,6 +44,7 @@ PG_FUNCTION_INFO_V1(test_bms_subset_compare); PG_FUNCTION_INFO_V1(test_bms_union); PG_FUNCTION_INFO_V1(test_bms_intersect); PG_FUNCTION_INFO_V1(test_bms_difference); +PG_FUNCTION_INFO_V1(test_bms_offset_members); PG_FUNCTION_INFO_V1(test_bms_is_empty); PG_FUNCTION_INFO_V1(test_bms_membership); PG_FUNCTION_INFO_V1(test_bms_singleton_member); @@ -65,6 +67,7 @@ PG_FUNCTION_INFO_V1(test_bitmap_match); /* Test utility functions */ PG_FUNCTION_INFO_V1(test_random_operations); +PG_FUNCTION_INFO_V1(test_random_offset_operations); /* Convenient macros to test results */ #define EXPECT_TRUE(expr) \ @@ -295,6 +298,18 @@ test_bms_difference(PG_FUNCTION_ARGS) PG_RETURN_BITMAPSET_AS_TEXT(result_bms); } +Datum +test_bms_offset_members(PG_FUNCTION_ARGS) +{ + Bitmapset *bms = PG_ARG_GETBITMAPSET(0); + int offset = PG_GETARG_INT32(1); + + /* left input gets recycled */ + bms = bms_offset_members(bms, offset); + + PG_RETURN_BITMAPSET_AS_TEXT(bms); +} + Datum test_bms_compare(PG_FUNCTION_ARGS) { @@ -580,7 +595,7 @@ test_bitmap_match(PG_FUNCTION_ARGS) } /* - * Contrary to all the other functions which are one-one mappings with the + * Contrary to most of the other functions which are one-one mappings with the * equivalent C functions, this stresses Bitmapsets in a random fashion for * various operations. * @@ -742,3 +757,89 @@ test_random_operations(PG_FUNCTION_ARGS) PG_RETURN_INT32(total_ops); } + +/* + * Random testing for bms_offset_members(). Generates a random set and then + * picks a number to offset the members by. We then create another set, which + * is built by looping over the members of the random set and performing + * bms_add_member and adding on the offset to create a known good set to + * compare the result of bms_offset_members() to. + * + * Arguments: + * arg1: optional random seed, or < 0 to use a random seed. + * arg2: the number of operations to perform. + * arg3: the maximum bitmapset member number to use in the random set. + * arg4: the minimum bitmapset member number to use in the random set. + */ +Datum +test_random_offset_operations(PG_FUNCTION_ARGS) +{ + pg_prng_state state; + uint64 seed; + int num_ops; + int max_range; + int min_value; + int member; + + if (PG_ARGISNULL(0)) + seed = GetCurrentTimestamp(); + else + seed = PG_GETARG_INT64(0); + + num_ops = PG_GETARG_INT32(1); + max_range = PG_GETARG_INT32(2); + min_value = PG_GETARG_INT32(3); + + if (PG_ARGISNULL(1) || num_ops <= 0) + elog(ERROR, "invalid number of operations"); + if (PG_ARGISNULL(2) || max_range <= 0) + elog(ERROR, "invalid maximum range"); + if (PG_ARGISNULL(3) || min_value < 0) + elog(ERROR, "invalid minimum value"); + + pg_prng_seed(&state, seed); + + for (int op = 0; op < num_ops; op++) + { + Bitmapset *random_bms = NULL; + Bitmapset *offset_bms1; + Bitmapset *offset_bms2 = NULL; + int offset; + int nmembers; + + CHECK_FOR_INTERRUPTS(); + + /* Figure out a random offset and how many members to add */ + offset = (pg_prng_uint32(&state) % max_range) - (pg_prng_uint32(&state) % max_range); + nmembers = pg_prng_uint32(&state) % max_range + min_value; + + for (int i = 0; i < nmembers; i++) + { + member = pg_prng_uint32(&state) % max_range + min_value; + random_bms = bms_add_member(random_bms, member); + } + + /* create a known-good set the old fashioned way */ + offset_bms2 = NULL; + member = -1; + while ((member = bms_next_member(random_bms, member)) >= 0) + { + if (member + offset >= 0) + offset_bms2 = bms_add_member(offset_bms2, member + offset); + } + + /* do the offsetting */ + offset_bms1 = bms_offset_members(random_bms, offset); + + /* check against the known-good set */ + if (!bms_equal(offset_bms1, offset_bms2)) + elog(ERROR, "bms_offset_members failed with offset %d seed " INT64_FORMAT, offset, seed); + + /* Cleanup before the next loop */ + bms_free(random_bms); + bms_free(offset_bms1); + bms_free(offset_bms2); + } + + PG_RETURN_INT32(num_ops); +} -- 2.51.0 ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-18 07:53 David Rowley <[email protected]> parent: Peter Eisentraut <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: David Rowley @ 2026-04-18 07:53 UTC (permalink / raw) To: Peter Eisentraut <[email protected]>; +Cc: Tom Lane <[email protected]>; Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> On Thu, 16 Apr 2026 at 07:17, Peter Eisentraut <[email protected]> wrote: > Depending on what you end up doing, maybe a sprinkling of pg_nodiscard > could be appropriate. Yeah maybe. It wouldn't do any harm, at least. I didn't do that in the patch I just sent as it felt like something we should do or not do for all the bitmapset functions it's relevant for. REALLOCATE_BITMAPSETS is meant to give us a stronger guarantee of people forgetting to do this, as it would cause a breakage if there were multiple pointers to the same set and only one of them got updated. David ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-18 15:09 Tom Lane <[email protected]> parent: David Rowley <[email protected]> 0 siblings, 0 replies; 15+ messages in thread From: Tom Lane @ 2026-04-18 15:09 UTC (permalink / raw) To: David Rowley <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; Chao Li <[email protected]>; PostgreSQL Developers <[email protected]> David Rowley <[email protected]> writes: > On Thu, 16 Apr 2026 at 07:17, Peter Eisentraut <[email protected]> wrote: >> Depending on what you end up doing, maybe a sprinkling of pg_nodiscard >> could be appropriate. > Yeah maybe. It wouldn't do any harm, at least. > I didn't do that in the patch I just sent as it felt like something we > should do or not do for all the bitmapset functions it's relevant for. Agreed, seems like it should be a separate patch. regards, tom lane ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-19 19:21 Greg Burd <[email protected]> parent: David Rowley <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Greg Burd @ 2026-04-19 19:21 UTC (permalink / raw) To: David Rowley <[email protected]>; Tom Lane <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Hackers <[email protected]> On Sat, Apr 18, 2026, at 3:49 AM, David Rowley wrote: > On Wed, 15 Apr 2026 at 14:33, David Rowley <[email protected]> wrote: >> >> On Wed, 15 Apr 2026 at 14:30, Tom Lane <[email protected]> wrote: >> > >> > David Rowley <[email protected]> writes: >> > > I'd not considered surprise-prone as an aspect. I understand we have >> > > bms_join and bms_union, which do the same thing if you only care about >> > > the value of the result and not what happens to the inputs. >> > >> > Sure, but bms_join is an optional optimization of the far safer >> > bms_union operation. It bothers me to create the optimized case >> > but not the base case. >> >> Hmm, yeah. That seems like a good argument for making a new set. I'll >> go make it so. > > Patch attached for the version that creates a new set rather than > modifying the input set in-place. > > David Hey David, > Attachments: > * v2-0001-Introduce-bms_offset_members-function.patch I applied, tested, and reviewed these changes. Thanks for doing this, only a few small things jumped out. nit: in bitmapset.c there is a new line added above bms_add_range() + * Arguments: + * arg1: optional random seed, or < 0 to use a random seed. + * arg2: the number of operations to perform. + * arg3: the maximum bitmapset member number to use in the random set. + * arg4: the minimum bitmapset member number to use in the random set. nit: whitespace ahead of arg1, also should be "NULL" not "< 0" in test_bitmapset.sql +-- perform some random test on bms_offset_members() nit: "tests" Also, I think the random testing will likely cover these, but here are a few more explicit tests for odd corner cases. -- shift that shrinks nwords SELECT test_bms_offset_members('(b 64 65 66)', -64); -- drops into word 0 -- shift that drops some low members and keeps others SELECT test_bms_offset_members('(b 0 1 2 10)', -2); -- expect (b 0 8) -- entire set shifts below zero -> empty SELECT test_bms_offset_members('(b 1 2 3)', -10); -- expect empty -- word-aligned positive and negative shifts SELECT test_bms_offset_members('(b 1 2 3)', 64); SELECT test_bms_offset_members('(b 65 66 67)', -64); -- INT_MIN boundary SELECT test_bms_offset_members('(b 1)', -2147483648); I like the functionality and the reduction of repeated code that you've identified and fixed. best. -greg ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-19 23:52 David Rowley <[email protected]> parent: Greg Burd <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: David Rowley @ 2026-04-19 23:52 UTC (permalink / raw) To: Greg Burd <[email protected]>; +Cc: Tom Lane <[email protected]>; Chao Li <[email protected]>; PostgreSQL Hackers <[email protected]> On Mon, 20 Apr 2026 at 07:22, Greg Burd <[email protected]> wrote: > I applied, tested, and reviewed these changes. Thanks for doing this, only a few small things jumped out. Many thanks. I took all of those suggestions. > SELECT test_bms_offset_members('(b 1)', -2147483648); I made that one use member 0 instead of 1. That'll mean "new_highest" goes to INT_MIN rather than INT_MIN + 1. David Attachments: [application/octet-stream] v3-0001-Introduce-bms_offset_members-function.patch (20.9K, 2-v3-0001-Introduce-bms_offset_members-function.patch) download | inline diff: From 0957ccce60a8c10c43f6ec8deb8011fe3b9e9722 Mon Sep 17 00:00:00 2001 From: David Rowley <[email protected]> Date: Fri, 20 Mar 2026 11:01:13 +1300 Subject: [PATCH v3] Introduce bms_offset_members() function Effectively a function to bitshift members by the specified number of bits. We have various fragments of code doing this manually with a bms_next_member() loop. We can do this more efficiently with bitshifting. --- src/backend/nodes/bitmapset.c | 135 ++++++++++++++++++ src/backend/optimizer/plan/setrefs.c | 10 +- src/backend/optimizer/prep/prepjointree.c | 30 ++-- src/backend/rewrite/rewriteManip.c | 25 +--- src/backend/statistics/extended_stats.c | 12 +- src/include/nodes/bitmapset.h | 1 + .../expected/test_bitmapset.out | 81 +++++++++++ .../test_bitmapset/sql/test_bitmapset.sql | 23 +++ .../test_bitmapset/test_bitmapset--1.0.sql | 8 ++ .../modules/test_bitmapset/test_bitmapset.c | 101 ++++++++++++- 10 files changed, 374 insertions(+), 52 deletions(-) diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c index f053d8c4d64..9cb4e252b2c 100644 --- a/src/backend/nodes/bitmapset.c +++ b/src/backend/nodes/bitmapset.c @@ -39,6 +39,7 @@ #include "postgres.h" #include "common/hashfn.h" +#include "common/int.h" #include "nodes/bitmapset.h" #include "nodes/pg_list.h" #include "port/pg_bitutils.h" @@ -405,6 +406,140 @@ bms_difference(const Bitmapset *a, const Bitmapset *b) return result; } +/* + * bms_offset_members + * Creates a new Bitmapset with all members of 'a' adjusted to add the + * value of 'offset' to each member. + * + * Members which would become negative as a result of a negative offset will + * be removed from the set, whereas too large an offset, which would result in + * a member going > INT_MAX, will result in an ERROR. + */ +Bitmapset * +bms_offset_members(const Bitmapset *a, int offset) +{ + Bitmapset *result; + int offset_words; + int offset_bits; + int new_nwords; + int old_nwords; + int32 high_bit; + int old_highest; + int new_highest; + + Assert(bms_is_valid_set(a)); + + /* nothing to do for empty sets */ + if (a == NULL) + return NULL; + + old_nwords = a->nwords; + offset_words = WORDNUM(offset); + offset_bits = BITNUM(offset); + high_bit = bmw_leftmost_one_pos(a->words[a->nwords - 1]); + old_highest = (old_nwords - 1) * BITS_PER_BITMAPWORD + high_bit; + + /* don't create a set with a member that doesn't fit into an int32 */ + if (pg_add_s32_overflow(old_highest, offset, &new_highest)) + elog(ERROR, "bitmapset overflow"); + /* return NULL if the new set would be empty */ + else if (new_highest < 0) + return NULL; + + new_nwords = WORDNUM(new_highest) + 1; + result = (Bitmapset *) palloc0(BITMAPSET_SIZE(new_nwords)); + result->type = T_Bitmapset; + result->nwords = new_nwords; + + /* handle zero and positive offsets (bitshift left) */ + if (offset >= 0) + { + /* + * We special-case offsetting only by whole words so we don't have to + * special-case bitshifting by BITS_PER_BITMAPWORD places, which has + * an undefined behavior. + */ + if (offset_bits == 0) + { + int i = 0; + + /* + * The old set is guaranteed to have at least 1 word, so use + * do/while to save the redundant initial loop bounds check. + */ + do + { + Assert(i + offset_words < new_nwords); + result->words[i + offset_words] = a->words[i]; + } while (++i < old_nwords); + } + else + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + int i = 0; + + do + { + bitmapword carry = (a->words[i] >> carry_bits); + + Assert(i + offset_words < new_nwords); + /* shift bits up and carry bits from the previous word */ + result->words[i + offset_words] = (a->words[i] << offset_bits) | prev_carry; + prev_carry = carry; + } while (++i < old_nwords); + result->words[new_nwords - 1] |= prev_carry; + } + } + + /* handle negative offset (bitshift right) */ + else + { + /* make the negative offset_words and offset_bits positive */ + offset_words = 0 - offset_words; + offset_bits = 0 - offset_bits; + + /* as above, special case shifting only by whole words */ + if (offset_bits == 0) + { + int i = 0; + + do + { + Assert(i + offset_words < old_nwords); + result->words[i] = a->words[i + offset_words]; + } while (++i < new_nwords); + } + else + { + int carry_bits = BITS_PER_BITMAPWORD - offset_bits; + bitmapword prev_carry = 0; + int i = new_nwords - 1; + + /* carry bits from any word just above where the loop starts */ + if (old_nwords > new_nwords + offset_words) + prev_carry = (a->words[new_nwords + offset_words] << carry_bits); + + /* + * We loop backward over the array so we correctly carry bits from + * higher words. + */ + do + { + bitmapword carry = (a->words[i + offset_words] << carry_bits); + + Assert(i + offset_words < old_nwords); + + /* shift bits down and carry bits from the previous word */ + result->words[i] = (a->words[i + offset_words] >> offset_bits) | prev_carry; + prev_carry = carry; + } while (--i >= 0); + } + } + + return result; +} + /* * bms_is_subset - is A a subset of B? */ diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c index ff0e875f2a2..ebae7a8dea9 100644 --- a/src/backend/optimizer/plan/setrefs.c +++ b/src/backend/optimizer/plan/setrefs.c @@ -2057,16 +2057,10 @@ set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset) static Relids offset_relid_set(Relids relids, int rtoffset) { - Relids result = NULL; - int rtindex; - - /* If there's no offset to apply, we needn't recompute the value */ + /* If there's no offset to apply, we needn't make another set */ if (rtoffset == 0) return relids; - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + rtoffset); - return result; + return bms_offset_members(relids, rtoffset); } /* diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 95bf51606cc..78281262ba1 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -3731,7 +3731,7 @@ has_notnull_forced_var(PlannerInfo *root, List *forced_null_vars, RangeTblEntry *rte; Bitmapset *notnullattnums; Bitmapset *forcednullattnums = NULL; - int attno; + int lowest_attno; varno++; @@ -3751,22 +3751,22 @@ has_notnull_forced_var(PlannerInfo *root, List *forced_null_vars, if (bms_is_member(varno, right_state->nullable_rels)) continue; - /* - * Iterate over attributes and adjust the bitmap indexes by - * FirstLowInvalidHeapAttributeNumber to get the actual attribute - * numbers. - */ - attno = -1; - while ((attno = bms_next_member(attrs, attno)) >= 0) - { - AttrNumber real_attno = attno + FirstLowInvalidHeapAttributeNumber; + /* find the lowest member to check if system columns are present */ + lowest_attno = bms_next_member(attrs, -1); - /* system columns cannot be NULL */ - if (real_attno < 0) - return true; + /* we checked for an empty set above */ + Assert(lowest_attno >= 0); - forcednullattnums = bms_add_member(forcednullattnums, real_attno); - } + /* system columns cannot be NULL */ + if (lowest_attno + FirstLowInvalidHeapAttributeNumber < 0) + return true; + + /* + * Offset the bitmap members by FirstLowInvalidHeapAttributeNumber to + * get the actual attribute numbers. + */ + forcednullattnums = bms_offset_members(attrs, + FirstLowInvalidHeapAttributeNumber); rte = rt_fetch(varno, root->parse->rtable); diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index 4bf4aa0d6d1..6137f4303f2 100644 --- a/src/backend/rewrite/rewriteManip.c +++ b/src/backend/rewrite/rewriteManip.c @@ -64,7 +64,6 @@ static bool contain_windowfuncs_walker(Node *node, void *context); static bool locate_windowfunc_walker(Node *node, locate_windowfunc_context *context); static bool checkExprHasSubLink_walker(Node *node, void *context); -static Relids offset_relid_set(Relids relids, int offset); static Node *add_nulling_relids_mutator(Node *node, add_nulling_relids_context *context); static Node *remove_nulling_relids_mutator(Node *node, @@ -397,8 +396,8 @@ OffsetVarNodes_walker(Node *node, OffsetVarNodes_context *context) if (var->varlevelsup == context->sublevels_up) { var->varno += context->offset; - var->varnullingrels = offset_relid_set(var->varnullingrels, - context->offset); + var->varnullingrels = bms_offset_members(var->varnullingrels, + context->offset); if (var->varnosyn > 0) var->varnosyn += context->offset; } @@ -435,10 +434,10 @@ OffsetVarNodes_walker(Node *node, OffsetVarNodes_context *context) if (phv->phlevelsup == context->sublevels_up) { - phv->phrels = offset_relid_set(phv->phrels, - context->offset); - phv->phnullingrels = offset_relid_set(phv->phnullingrels, - context->offset); + phv->phrels = bms_offset_members(phv->phrels, + context->offset); + phv->phnullingrels = bms_offset_members(phv->phnullingrels, + context->offset); } /* fall through to examine children */ } @@ -524,18 +523,6 @@ OffsetVarNodes(Node *node, int offset, int sublevels_up) OffsetVarNodes_walker(node, &context); } -static Relids -offset_relid_set(Relids relids, int offset) -{ - Relids result = NULL; - int rtindex; - - rtindex = -1; - while ((rtindex = bms_next_member(relids, rtindex)) >= 0) - result = bms_add_member(result, rtindex + offset); - return result; -} - /* * ChangeVarNodes - adjust Var nodes for a specific change of RT index * diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 2b83355d26e..332e7423bd8 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1673,8 +1673,7 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, */ if (!leakproof) { - Bitmapset *clause_attnums = NULL; - int attnum = -1; + Bitmapset *clause_attnums; /* * We have to check per-column privileges. *attnums has the attnums @@ -1685,13 +1684,8 @@ statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid, * while attnums within *attnums aren't. Convert *attnums to the * offset style so we can combine the results. */ - while ((attnum = bms_next_member(*attnums, attnum)) >= 0) - { - clause_attnums = - bms_add_member(clause_attnums, - attnum - FirstLowInvalidHeapAttributeNumber); - } - + clause_attnums = bms_offset_members(*attnums, + 0 - FirstLowInvalidHeapAttributeNumber); /* Now merge attnums from *exprs into clause_attnums */ if (*exprs != NIL) pull_varattnos((Node *) *exprs, relid, &clause_attnums); diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h index 067ec72e99b..997f8a1cd96 100644 --- a/src/include/nodes/bitmapset.h +++ b/src/include/nodes/bitmapset.h @@ -100,6 +100,7 @@ extern void bms_free(Bitmapset *a); extern Bitmapset *bms_union(const Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_intersect(const Bitmapset *a, const Bitmapset *b); extern Bitmapset *bms_difference(const Bitmapset *a, const Bitmapset *b); +extern Bitmapset *bms_offset_members(const Bitmapset *a, int offset); extern bool bms_is_subset(const Bitmapset *a, const Bitmapset *b); extern BMS_Comparison bms_subset_compare(const Bitmapset *a, const Bitmapset *b); extern bool bms_is_member(int x, const Bitmapset *a); diff --git a/src/test/modules/test_bitmapset/expected/test_bitmapset.out b/src/test/modules/test_bitmapset/expected/test_bitmapset.out index 0b72b91cd1f..0574b047844 100644 --- a/src/test/modules/test_bitmapset/expected/test_bitmapset.out +++ b/src/test/modules/test_bitmapset/expected/test_bitmapset.out @@ -528,6 +528,80 @@ SELECT test_bms_difference(NULL, NULL) AS result; <> (1 row) +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +ERROR: bitmapset overflow +SELECT test_bms_offset_members('(b 2)', 2147483646); +ERROR: bitmapset overflow +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); + test_bms_offset_members +------------------------- + (b 2 4 6) +(1 row) + +SELECT test_bms_offset_members('(b 1 3 5)', 0); + test_bms_offset_members +------------------------- + (b 1 3 5) +(1 row) + +SELECT test_bms_offset_members('(b 1 3 5)', -1); + test_bms_offset_members +------------------------- + (b 0 2 4) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); + test_bms_offset_members +------------------------- + (b 32 33 64 65) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', 0); + test_bms_offset_members +------------------------- + (b 31 32 63 64) +(1 row) + +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); + test_bms_offset_members +------------------------- + (b 30 31 62 63) +(1 row) + +SELECT test_bms_offset_members('(b 1 2 3)', 64); + test_bms_offset_members +------------------------- + (b 65 66 67) +(1 row) + +SELECT test_bms_offset_members('(b 65 66 67)', -64); + test_bms_offset_members +------------------------- + (b 1 2 3) +(1 row) + +-- Ensure members going below zero silently fall off the set +SELECT test_bms_offset_members('(b 0 1 2 10)', -2); + test_bms_offset_members +------------------------- + (b 0 8) +(1 row) + +SELECT test_bms_offset_members('(b 1 2 3)', -10); + test_bms_offset_members +------------------------- + <> +(1 row) + +SELECT test_bms_offset_members('(b 0)', -2147483648); + test_bms_offset_members +------------------------- + <> +(1 row) + -- bms_is_member() SELECT test_bms_is_member('(b)', -5); -- error ERROR: negative bitmapset member not allowed @@ -1575,4 +1649,11 @@ SELECT test_random_operations(NULL, 10000, 81920, 0) > 0 AS result; t (1 row) +-- perform some random tests on bms_offset_members() +SELECT test_random_offset_operations(NULL, 10000, 1024, 0) AS result; + result +-------- + 10000 +(1 row) + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql index c53232e0ada..1d2916dcaa5 100644 --- a/src/test/modules/test_bitmapset/sql/test_bitmapset.sql +++ b/src/test/modules/test_bitmapset/sql/test_bitmapset.sql @@ -145,6 +145,26 @@ SELECT test_bms_difference('(b 5)', NULL) AS result; SELECT test_bms_difference(NULL, '(b 5)') AS result; SELECT test_bms_difference(NULL, NULL) AS result; +-- bms_offset_members() +-- Ensure overflow detection works +SELECT test_bms_offset_members('(b 1)', 2147483647); +SELECT test_bms_offset_members('(b 2)', 2147483646); + +-- Ensure members are all offset as expected +SELECT test_bms_offset_members('(b 1 3 5)', 1); +SELECT test_bms_offset_members('(b 1 3 5)', 0); +SELECT test_bms_offset_members('(b 1 3 5)', -1); +SELECT test_bms_offset_members('(b 31 32 63 64)', 1); +SELECT test_bms_offset_members('(b 31 32 63 64)', 0); +SELECT test_bms_offset_members('(b 31 32 63 64)', -1); +SELECT test_bms_offset_members('(b 1 2 3)', 64); +SELECT test_bms_offset_members('(b 65 66 67)', -64); + +-- Ensure members going below zero silently fall off the set +SELECT test_bms_offset_members('(b 0 1 2 10)', -2); +SELECT test_bms_offset_members('(b 1 2 3)', -10); +SELECT test_bms_offset_members('(b 0)', -2147483648); + -- bms_is_member() SELECT test_bms_is_member('(b)', -5); -- error SELECT test_bms_is_member('(b 1 3 5)', 1) AS result; @@ -403,4 +423,7 @@ SELECT test_bms_nonempty_difference('(b 1 2)', '(b 50 100)') AS result; -- random operations SELECT test_random_operations(NULL, 10000, 81920, 0) > 0 AS result; +-- perform some random tests on bms_offset_members() +SELECT test_random_offset_operations(NULL, 10000, 1024, 0) AS result; + DROP EXTENSION test_bitmapset; diff --git a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql index e7b263e51f5..102fdd69478 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql +++ b/src/test/modules/test_bitmapset/test_bitmapset--1.0.sql @@ -56,6 +56,10 @@ CREATE FUNCTION test_bms_difference(text, text) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_bms_offset_members(text, integer) +RETURNS text +AS 'MODULE_PATHNAME' LANGUAGE C; + CREATE FUNCTION test_bms_is_empty(text) RETURNS boolean AS 'MODULE_PATHNAME' LANGUAGE C; @@ -137,4 +141,8 @@ CREATE FUNCTION test_random_operations(bigint, integer, integer, integer) RETURNS integer AS 'MODULE_PATHNAME' LANGUAGE C; +CREATE FUNCTION test_random_offset_operations(bigint, integer, integer, integer) +RETURNS integer +AS 'MODULE_PATHNAME' LANGUAGE C; + COMMENT ON EXTENSION test_bitmapset IS 'Test code for Bitmapset'; diff --git a/src/test/modules/test_bitmapset/test_bitmapset.c b/src/test/modules/test_bitmapset/test_bitmapset.c index 3a185369651..950795ae611 100644 --- a/src/test/modules/test_bitmapset/test_bitmapset.c +++ b/src/test/modules/test_bitmapset/test_bitmapset.c @@ -44,6 +44,7 @@ PG_FUNCTION_INFO_V1(test_bms_subset_compare); PG_FUNCTION_INFO_V1(test_bms_union); PG_FUNCTION_INFO_V1(test_bms_intersect); PG_FUNCTION_INFO_V1(test_bms_difference); +PG_FUNCTION_INFO_V1(test_bms_offset_members); PG_FUNCTION_INFO_V1(test_bms_is_empty); PG_FUNCTION_INFO_V1(test_bms_membership); PG_FUNCTION_INFO_V1(test_bms_singleton_member); @@ -66,6 +67,7 @@ PG_FUNCTION_INFO_V1(test_bitmap_match); /* Test utility functions */ PG_FUNCTION_INFO_V1(test_random_operations); +PG_FUNCTION_INFO_V1(test_random_offset_operations); /* Convenient macros to test results */ #define EXPECT_TRUE(expr) \ @@ -296,6 +298,17 @@ test_bms_difference(PG_FUNCTION_ARGS) PG_RETURN_BITMAPSET_AS_TEXT(result_bms); } +Datum +test_bms_offset_members(PG_FUNCTION_ARGS) +{ + Bitmapset *bms = PG_ARG_GETBITMAPSET(0); + int offset = PG_GETARG_INT32(1); + + bms = bms_offset_members(bms, offset); + + PG_RETURN_BITMAPSET_AS_TEXT(bms); +} + Datum test_bms_compare(PG_FUNCTION_ARGS) { @@ -581,7 +594,7 @@ test_bitmap_match(PG_FUNCTION_ARGS) } /* - * Contrary to all the other functions which are one-one mappings with the + * Contrary to most of the other functions which are one-one mappings with the * equivalent C functions, this stresses Bitmapsets in a random fashion for * various operations. * @@ -766,3 +779,89 @@ test_random_operations(PG_FUNCTION_ARGS) PG_RETURN_INT32(total_ops); } + +/* + * Random testing for bms_offset_members(). Generates a random set and then + * picks a number to offset the members by. We then create another set, which + * is built by looping over the members of the random set and performing + * bms_add_member and adding on the offset to create a known good set to + * compare the result of bms_offset_members() to. + * + * Arguments: + * arg1: optional random seed. NULL means use a random seed. + * arg2: the number of operations to perform. + * arg3: the maximum bitmapset member number to use in the random set. + * arg4: the minimum bitmapset member number to use in the random set. + */ +Datum +test_random_offset_operations(PG_FUNCTION_ARGS) +{ + pg_prng_state state; + uint64 seed; + int num_ops; + int max_range; + int min_value; + int member; + + if (PG_ARGISNULL(0)) + seed = GetCurrentTimestamp(); + else + seed = PG_GETARG_INT64(0); + + num_ops = PG_GETARG_INT32(1); + max_range = PG_GETARG_INT32(2); + min_value = PG_GETARG_INT32(3); + + if (PG_ARGISNULL(1) || num_ops <= 0) + elog(ERROR, "invalid number of operations"); + if (PG_ARGISNULL(2) || max_range <= 0) + elog(ERROR, "invalid maximum range"); + if (PG_ARGISNULL(3) || min_value < 0) + elog(ERROR, "invalid minimum value"); + + pg_prng_seed(&state, seed); + + for (int op = 0; op < num_ops; op++) + { + Bitmapset *random_bms = NULL; + Bitmapset *offset_bms1; + Bitmapset *offset_bms2 = NULL; + int offset; + int nmembers; + + CHECK_FOR_INTERRUPTS(); + + /* Figure out a random offset and how many members to add */ + offset = (pg_prng_uint32(&state) % max_range) - (pg_prng_uint32(&state) % max_range); + nmembers = pg_prng_uint32(&state) % max_range + min_value; + + for (int i = 0; i < nmembers; i++) + { + member = pg_prng_uint32(&state) % max_range + min_value; + random_bms = bms_add_member(random_bms, member); + } + + /* create a known-good set the old fashioned way */ + offset_bms2 = NULL; + member = -1; + while ((member = bms_next_member(random_bms, member)) >= 0) + { + if (member + offset >= 0) + offset_bms2 = bms_add_member(offset_bms2, member + offset); + } + + /* do the offsetting */ + offset_bms1 = bms_offset_members(random_bms, offset); + + /* check against the known-good set */ + if (!bms_equal(offset_bms1, offset_bms2)) + elog(ERROR, "bms_offset_members failed with offset %d seed " INT64_FORMAT, offset, seed); + + /* Cleanup before the next loop */ + bms_free(random_bms); + bms_free(offset_bms1); + bms_free(offset_bms2); + } + + PG_RETURN_INT32(num_ops); +} -- 2.51.0 ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-20 14:55 Greg Burd <[email protected]> parent: David Rowley <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Greg Burd @ 2026-04-20 14:55 UTC (permalink / raw) To: David Rowley <[email protected]>; +Cc: Tom Lane <[email protected]>; Chao Li <[email protected]>; PostgreSQL Hackers <[email protected]> On Sun, Apr 19, 2026, at 7:52 PM, David Rowley wrote: > On Mon, 20 Apr 2026 at 07:22, Greg Burd <[email protected]> wrote: >> I applied, tested, and reviewed these changes. Thanks for doing this, only a few small things jumped out. > > Many thanks. I took all of those suggestions. Happy to help. >> SELECT test_bms_offset_members('(b 1)', -2147483648); > > I made that one use member 0 instead of 1. That'll mean "new_highest" > goes to INT_MIN rather than INT_MIN + 1. Perfect, that covers the gap nicely. Were you planning on writing the optimized non-copy version as well? I don't think it is strictly necessary, more a curiosity. bms_offset_members() -> new bms, might repalloc() replaces existing loops you've found bms_shift_members() -> bms is modified in place and fits your new use case a bit better best. -greg > David > > Attachments: > * v3-0001-Introduce-bms_offset_members-function.patch ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Add bms_offset_members() function for bitshifting Bitmapsets @ 2026-04-21 01:40 David Rowley <[email protected]> parent: Greg Burd <[email protected]> 0 siblings, 0 replies; 15+ messages in thread From: David Rowley @ 2026-04-21 01:40 UTC (permalink / raw) To: Greg Burd <[email protected]>; +Cc: Tom Lane <[email protected]>; Chao Li <[email protected]>; PostgreSQL Hackers <[email protected]> On Tue, 21 Apr 2026 at 02:55, Greg Burd <[email protected]> wrote: > Were you planning on writing the optimized non-copy version as well? I don't think it is strictly necessary, more a curiosity. > > bms_offset_members() -> new bms, might repalloc() replaces existing loops you've found > bms_shift_members() -> bms is modified in place and fits your new use case a bit better Not at this stage. The v1 patch did modify the set in-place, so the code is there if we ever need it. I didn't find any need for it in our current code. The selective tuple deforming patch I'm working on could use it, but I doubt it's worth the trouble for 1 caller. It's just for something that happens during create_plan(), so 1 more allocation in that code likely isn't going to be noticed. David ^ permalink raw reply [nested|flat] 15+ messages in thread
end of thread, other threads:[~2026-04-21 01:40 UTC | newest] Thread overview: 15+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-04-13 04:35 Add bms_offset_members() function for bitshifting Bitmapsets David Rowley <[email protected]> 2026-04-14 08:46 ` Chao Li <[email protected]> 2026-04-14 23:45 ` David Rowley <[email protected]> 2026-04-15 00:29 ` Tom Lane <[email protected]> 2026-04-15 02:21 ` David Rowley <[email protected]> 2026-04-15 02:30 ` Tom Lane <[email protected]> 2026-04-15 02:33 ` David Rowley <[email protected]> 2026-04-15 19:17 ` Peter Eisentraut <[email protected]> 2026-04-18 07:53 ` David Rowley <[email protected]> 2026-04-18 15:09 ` Tom Lane <[email protected]> 2026-04-18 07:49 ` David Rowley <[email protected]> 2026-04-19 19:21 ` Greg Burd <[email protected]> 2026-04-19 23:52 ` David Rowley <[email protected]> 2026-04-20 14:55 ` Greg Burd <[email protected]> 2026-04-21 01:40 ` David Rowley <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox