public inbox for [email protected]
help / color / mirror / Atom feedAssertion failure in hash_kill_items()
4+ messages / 2 participants
[nested] [flat]
* Assertion failure in hash_kill_items()
@ 2026-03-17 17:15 Heikki Linnakangas <[email protected]>
2026-03-17 17:40 ` Re: Assertion failure in hash_kill_items() Andres Freund <[email protected]>
0 siblings, 1 reply; 4+ messages in thread
From: Heikki Linnakangas @ 2026-03-17 17:15 UTC (permalink / raw)
To: Andres Freund <[email protected]>; pgsql-hackers; +Cc: Alexander Kuzmenkov <[email protected]>
I bumped into an assertion failure, while playing with variants of the
test case that Alexander Kuzmenkov wrote to exercise hash index page
cleanup [1]. This is master-only, related to the recent changes in how
buffers are marked dirty.
CREATE TABLE hash_cleanup_heap(keycol INT);
CREATE INDEX hash_cleanup_index on hash_cleanup_heap USING HASH (keycol);
BEGIN;
INSERT INTO hash_cleanup_heap SELECT 1 FROM generate_series(1, 500) as i;
ROLLBACK;
SET enable_seqscan = off;
SET enable_bitmapscan = off;
SELECT count(*) FROM hash_cleanup_heap WHERE keycol = 1;
TRAP: failed Assert("BufferIsValid(buffer)"), File:
"../src/backend/storage/buffer/bufmgr.c", Line: 509, PID: 1275145
postgres: heikki postgres [local] SELECT(ExceptionalCondition+0x84)
[0xaaaaecab8650]
postgres: heikki postgres [local] SELECT(+0x20498b4) [0xaaaaec4698b4]
postgres: heikki postgres [local] SELECT(+0x205db78) [0xaaaaec47db78]
postgres: heikki postgres [local] SELECT(BufferBeginSetHintBits+0x44)
[0xaaaaec47df58]
postgres: heikki postgres [local] SELECT(_hash_kill_items+0xa8c)
[0xaaaaeb6a51c8]
postgres: heikki postgres [local] SELECT(_hash_next+0x2a8) [0xaaaaeb69d780]
postgres: heikki postgres [local] SELECT(hashgettuple+0x5a4)
[0xaaaaeb682920]
postgres: heikki postgres [local] SELECT(index_getnext_tid+0x4b4)
[0xaaaaeb73322c]
postgres: heikki postgres [local] SELECT(index_getnext_slot+0x90)
[0xaaaaeb733ebc]
postgres: heikki postgres [local] SELECT(+0x19507d8) [0xaaaaebd707d8]
postgres: heikki postgres [local] SELECT(+0x189a47c) [0xaaaaebcba47c]
postgres: heikki postgres [local] SELECT(+0x189a588) [0xaaaaebcba588]
postgres: heikki postgres [local] SELECT(ExecScan+0x18c) [0xaaaaebcbab24]
postgres: heikki postgres [local] SELECT(+0x1953a00) [0xaaaaebd73a00]
postgres: heikki postgres [local] SELECT(+0x188bf9c) [0xaaaaebcabf9c]
postgres: heikki postgres [local] SELECT(+0x18cb754) [0xaaaaebceb754]
postgres: heikki postgres [local] SELECT(+0x18ccd70) [0xaaaaebcecd70]
postgres: heikki postgres [local] SELECT(+0x18dae40) [0xaaaaebcfae40]
postgres: heikki postgres [local] SELECT(+0x18d98d8) [0xaaaaebcf98d8]
postgres: heikki postgres [local] SELECT(+0x188bf9c) [0xaaaaebcabf9c]
postgres: heikki postgres [local] SELECT(+0x1858814) [0xaaaaebc78814]
postgres: heikki postgres [local] SELECT(+0x1863318) [0xaaaaebc83318]
postgres: heikki postgres [local] SELECT(standard_ExecutorRun+0x594)
[0xaaaaebc7a034]
The first attached patch fixes it. It's pretty straightforward: the
function was using so->currPos.buf, but that's only valid if the page
was already pinned on entry to the function. It should be using the
local 'buf' variable instead.
The second patch simplifies the condition in the 'unlock_page' part.
This isn't new, and isn't needed to fix the bug, it just caught my eye
while looking at this. I don't understand why the condition was the way
it was, checking just 'havePin' seems sufficient and more correct to me.
Am I missing something?
[1]
https://www.postgresql.org/message-id/CALzhyqxrc1ZHYmf5V8NE%2ByMboqVg7xZrQM7K2c7VS0p1v8z42w%40mail.g...
- Heikki
Attachments:
[text/x-patch] 0001-fix-crash-on-killing-items-on-hash-overflow-pages.patch (1.1K, 2-0001-fix-crash-on-killing-items-on-hash-overflow-pages.patch)
download | inline diff:
From de433e4092e33e49a5f01c71d5ab6b34320167cd Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 17 Mar 2026 19:06:04 +0200
Subject: [PATCH 1/2] fix crash on killing items on hash overflow pages
---
src/backend/access/hash/hashutil.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/hash/hashutil.c b/src/backend/access/hash/hashutil.c
index 3e16119d027..081adbc88a6 100644
--- a/src/backend/access/hash/hashutil.c
+++ b/src/backend/access/hash/hashutil.c
@@ -600,7 +600,7 @@ _hash_kill_items(IndexScanDesc scan)
* update the page while just holding a share lock. If we
* are not allowed, there's no point continuing.
*/
- if (!BufferBeginSetHintBits(so->currPos.buf))
+ if (!BufferBeginSetHintBits(buf))
goto unlock_page;
}
@@ -621,7 +621,7 @@ _hash_kill_items(IndexScanDesc scan)
if (killedsomething)
{
opaque->hasho_flag |= LH_PAGE_HAS_DEAD_TUPLES;
- BufferFinishSetHintBits(so->currPos.buf, true, true);
+ BufferFinishSetHintBits(buf, true, true);
}
unlock_page:
--
2.47.3
[text/x-patch] 0002-simplify-the-condition-in-unlock_page.patch (778B, 3-0002-simplify-the-condition-in-unlock_page.patch)
download | inline diff:
From 8e7e3f6e63c92db8a7d8139ec60c0fdee98830ac Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 17 Mar 2026 19:08:21 +0200
Subject: [PATCH 2/2] simplify the condition in 'unlock_page'
---
src/backend/access/hash/hashutil.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/backend/access/hash/hashutil.c b/src/backend/access/hash/hashutil.c
index 081adbc88a6..1d1b05f876a 100644
--- a/src/backend/access/hash/hashutil.c
+++ b/src/backend/access/hash/hashutil.c
@@ -625,8 +625,7 @@ _hash_kill_items(IndexScanDesc scan)
}
unlock_page:
- if (so->hashso_bucket_buf == so->currPos.buf ||
- havePin)
+ if (havePin)
LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK);
else
_hash_relbuf(rel, buf);
--
2.47.3
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: Assertion failure in hash_kill_items()
2026-03-17 17:15 Assertion failure in hash_kill_items() Heikki Linnakangas <[email protected]>
@ 2026-03-17 17:40 ` Andres Freund <[email protected]>
2026-03-17 20:50 ` Re: Assertion failure in hash_kill_items() Andres Freund <[email protected]>
2026-04-07 14:38 ` Re: Assertion failure in hash_kill_items() Heikki Linnakangas <[email protected]>
0 siblings, 2 replies; 4+ messages in thread
From: Andres Freund @ 2026-03-17 17:40 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: pgsql-hackers; Alexander Kuzmenkov <[email protected]>; Ashutosh Sharma <[email protected]>
Hi,
On 2026-03-17 19:15:10 +0200, Heikki Linnakangas wrote:
> I bumped into an assertion failure, while playing with variants of the test
> case that Alexander Kuzmenkov wrote to exercise hash index page cleanup [1].
> This is master-only, related to the recent changes in how buffers are marked
> dirty.
Sorry, I had hoped to push a fix for that already, after it was reported in
https://postgr.es/m/vjtmvwvbxt7w5uyacxpzibpj65ewcb7uqaqbhd4arvnjbp5jqz%405ksdh6fsyqve
but real life intervened.
I was planning to commit it together with an addition to
src/test/modules/index/specs/killtuples.spec
Unfortunately that made the change a good bit more verbose, as a naive
addition would report a number of buffer accesses that seemed not necessarily
reliable to me. So I updated the 'result' step to just return true/false
depending on whether there were any accesses.
I'll go and work on pushing that.
> The first attached patch fixes it. It's pretty straightforward: the function
> was using so->currPos.buf, but that's only valid if the page was already
> pinned on entry to the function. It should be using the local 'buf' variable
> instead.
Yea, that was a stupid bug on my part. No idea how I ended up with it. At
first I thought it might have been a rebase issue, but I didn't see any
relevant change that could have caused that.
> The second patch simplifies the condition in the 'unlock_page' part. This
> isn't new, and isn't needed to fix the bug, it just caught my eye while
> looking at this. I don't understand why the condition was the way it was,
> checking just 'havePin' seems sufficient and more correct to me. Am I
> missing something?
I can't see anything either, quite odd. Most likely explanation seems to be
that something changed during the development of 7c75ef571579.
Indeed, the first version of the patch from
https://postgr.es/m/CAE9k0Pm3KTx93K8_5j6VMzG4h5F%2BSyknxUwXrN-zqSZ9X8ZS3w%40mail.gmail.com
was using "if (so->hashso_bucket_buf == so->currPos.buf)" both at the start
and end of _hash_kill_items(). So likely it was just an accident during patch
revisions.
Greetings,
Andres Freund
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: Assertion failure in hash_kill_items()
2026-03-17 17:15 Assertion failure in hash_kill_items() Heikki Linnakangas <[email protected]>
2026-03-17 17:40 ` Re: Assertion failure in hash_kill_items() Andres Freund <[email protected]>
@ 2026-03-17 20:50 ` Andres Freund <[email protected]>
1 sibling, 0 replies; 4+ messages in thread
From: Andres Freund @ 2026-03-17 20:50 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: pgsql-hackers; Alexander Kuzmenkov <[email protected]>; Ashutosh Sharma <[email protected]>
Hi,
On 2026-03-17 13:40:10 -0400, Andres Freund wrote:
> On 2026-03-17 19:15:10 +0200, Heikki Linnakangas wrote:
> > I bumped into an assertion failure, while playing with variants of the test
> > case that Alexander Kuzmenkov wrote to exercise hash index page cleanup [1].
> > This is master-only, related to the recent changes in how buffers are marked
> > dirty.
>
> Sorry, I had hoped to push a fix for that already, after it was reported in
> https://postgr.es/m/vjtmvwvbxt7w5uyacxpzibpj65ewcb7uqaqbhd4arvnjbp5jqz%405ksdh6fsyqve
> but real life intervened.
>
> I was planning to commit it together with an addition to
> src/test/modules/index/specs/killtuples.spec
>
> Unfortunately that made the change a good bit more verbose, as a naive
> addition would report a number of buffer accesses that seemed not necessarily
> reliable to me. So I updated the 'result' step to just return true/false
> depending on whether there were any accesses.
>
> I'll go and work on pushing that.
Done, as of f5eb854ab6d.
Greetings,
Andres
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: Assertion failure in hash_kill_items()
2026-03-17 17:15 Assertion failure in hash_kill_items() Heikki Linnakangas <[email protected]>
2026-03-17 17:40 ` Re: Assertion failure in hash_kill_items() Andres Freund <[email protected]>
@ 2026-04-07 14:38 ` Heikki Linnakangas <[email protected]>
1 sibling, 0 replies; 4+ messages in thread
From: Heikki Linnakangas @ 2026-04-07 14:38 UTC (permalink / raw)
To: Andres Freund <[email protected]>; +Cc: pgsql-hackers; Alexander Kuzmenkov <[email protected]>; Ashutosh Sharma <[email protected]>
On 17/03/2026 19:40, Andres Freund wrote:
> On 2026-03-17 19:15:10 +0200, Heikki Linnakangas wrote:
>> The second patch simplifies the condition in the 'unlock_page' part. This
>> isn't new, and isn't needed to fix the bug, it just caught my eye while
>> looking at this. I don't understand why the condition was the way it was,
>> checking just 'havePin' seems sufficient and more correct to me. Am I
>> missing something?
>
> I can't see anything either, quite odd. Most likely explanation seems to be
> that something changed during the development of 7c75ef571579.
>
>
> Indeed, the first version of the patch from
> https://postgr.es/m/CAE9k0Pm3KTx93K8_5j6VMzG4h5F%2BSyknxUwXrN-zqSZ9X8ZS3w%40mail.gmail.com
> was using "if (so->hashso_bucket_buf == so->currPos.buf)" both at the start
> and end of _hash_kill_items(). So likely it was just an accident during patch
> revisions.
Thanks for archeological excavation; pushed this second patch now.
- Heikki
^ permalink raw reply [nested|flat] 4+ messages in thread
end of thread, other threads:[~2026-04-07 14:38 UTC | newest]
Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-03-17 17:15 Assertion failure in hash_kill_items() Heikki Linnakangas <[email protected]>
2026-03-17 17:40 ` Andres Freund <[email protected]>
2026-03-17 20:50 ` Andres Freund <[email protected]>
2026-04-07 14:38 ` Heikki Linnakangas <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox