public inbox for [email protected]
help / color / mirror / Atom feedFrom: David Rowley <[email protected]>
To: Chao Li <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: PostgreSQL Developers <[email protected]>
Subject: Re: Small and unlikely overflow hazard in bms_next_member()
Date: Sat, 4 Apr 2026 16:30:32 +1300
Message-ID: <CAApHDvo+WtKHWbgmPA+H2K24P6h6aUJ_9kAdS__G1L9LDFGwgQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAApHDvq0T=iJ0Sf5TNE9yyWwfOeVjmrBt0wSywDnGD9Y4YJQBA@mail.gmail.com>
<[email protected]>
<CAApHDvrvvq_m+nRwjsOpCsFa4EtVtmvJX7zAD=Siria-x6DpbQ@mail.gmail.com>
<CAApHDvqTUm3Cbgz3ZLV+ad8s_HJHZYrVbrBvGyPQdxCRR-6dvA@mail.gmail.com>
<[email protected]>
<CAApHDvocDY0qvhxRNBaNRjEr46j8TsGFig+=6bHN-NYS_dhSCw@mail.gmail.com>
<[email protected]>
On Sat, 4 Apr 2026 at 02:28, Chao Li <[email protected]> wrote:
> In test_bms_next2.c, you set words_to_alloc = 1, which disables the optimization in the unrolled version. If I change words_to_alloc = 2, then on my MacBook M4, the unrolled version seems to win:
> ```
> chaol@ChaodeMacBook-Air test % ./test_bms_next2
> Benchmarking 100000000 iterations...
>
> Original: 0.61559 seconds
> David's: 0.61651 seconds
> Chao's version: 1.06033 seconds
> Unrolled loop: 0.60561 seconds
> 2800000000
> ```
Doing the same locally, here's what I get on my Zen2 machine:
drowley@amd3990x:~$ gcc test_bms_next.c -O2 -o test_bms_next && ./test_bms_next
Benchmarking 100000000 iterations...
Original: 1.21125 seconds
David's: 1.11997 seconds
Chao's version: 1.72662 seconds
Unrolled loop: 1.63090 seconds
2800000000
drowley@amd3990x:~$ clang test_bms_next.c -O2 -o test_bms_next &&
./test_bms_next
Benchmarking 100000000 iterations...
Original: 1.10780 seconds
David's: 1.05968 seconds
Chao's version: 1.87123 seconds
Unrolled loop: 1.11002 seconds
2800000000
> I guess one-word Bitmapset are probably the most common case in practice. So overall, I agree that your version is the best choice. Please go ahead with it as we have already gained both a bug fix and a performance improvement. If we really want to study the performance more deeply, I think that would be a separate topic.
Yeah, but it's better to find bottlenecks and focus there than to
optimise blindly without knowing if it'll make any actual difference.
It's still important to test for regressions when modifying code and
try to avoid them, as it's sometimes not obvious if the code being
modified is critical for performance.
I also don't think we should be doing anything to optimise for
multi-word sets at the expense of slowing down single-word sets. Not
that it's very representative of the real world, but I added some
telemetry to bitmapset.c and I see 43 multi-word sets being created
and 1.45 million single-word sets created during make check, or 1 in
33785, if you prefer.
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 786f343b3c9..21b5053e9a2 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -219,6 +219,10 @@ bms_make_singleton(int x)
int wordnum,
bitnum;
+ if (x >= 64)
+ elog(NOTICE, "multi-word set bms_make_singleton");
+ else
+ elog(NOTICE, "single-word set bms_make_singleton");
if (x < 0)
elog(ERROR, "negative bitmapset member not allowed");
wordnum = WORDNUM(x);
@@ -811,6 +815,9 @@ bms_add_member(Bitmapset *a, int x)
wordnum = WORDNUM(x);
bitnum = BITNUM(x);
+ if (x >= 64 && a->nwords == 1)
+ elog(NOTICE, "multi-set bms_add_member");
+
/* enlarge the set if necessary */
if (wordnum >= a->nwords)
{
Not quite perfect as a set made first as a single word set that later
becomes a multi-word set will be double counted. The number of
operations on the sets is likely more important anyway, not the number
of sets being created. The point is, multi-word sets are rare for most
workloads.
David
view thread (17+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: Small and unlikely overflow hazard in bms_next_member()
In-Reply-To: <CAApHDvo+WtKHWbgmPA+H2K24P6h6aUJ_9kAdS__G1L9LDFGwgQ@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox