public inbox for [email protected]  
help / color / mirror / Atom feed
Assertion failure in hash_kill_items()
4+ messages / 2 participants
[nested] [flat]

* Assertion failure in hash_kill_items()
@ 2026-03-17 17:15  Heikki Linnakangas <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Heikki Linnakangas @ 2026-03-17 17:15 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; pgsql-hackers; +Cc: Alexander Kuzmenkov <[email protected]>

I bumped into an assertion failure, while playing with variants of the 
test case that Alexander Kuzmenkov wrote to exercise hash index page 
cleanup [1]. This is master-only, related to the recent changes in how 
buffers are marked dirty.

CREATE TABLE hash_cleanup_heap(keycol INT);
CREATE INDEX hash_cleanup_index on hash_cleanup_heap USING HASH (keycol);
BEGIN;
INSERT INTO hash_cleanup_heap SELECT 1 FROM generate_series(1, 500) as i;
ROLLBACK;
SET enable_seqscan = off;
SET enable_bitmapscan = off;
SELECT count(*) FROM hash_cleanup_heap WHERE keycol = 1;

TRAP: failed Assert("BufferIsValid(buffer)"), File: 
"../src/backend/storage/buffer/bufmgr.c", Line: 509, PID: 1275145
postgres: heikki postgres [local] SELECT(ExceptionalCondition+0x84) 
[0xaaaaecab8650]
postgres: heikki postgres [local] SELECT(+0x20498b4) [0xaaaaec4698b4]
postgres: heikki postgres [local] SELECT(+0x205db78) [0xaaaaec47db78]
postgres: heikki postgres [local] SELECT(BufferBeginSetHintBits+0x44) 
[0xaaaaec47df58]
postgres: heikki postgres [local] SELECT(_hash_kill_items+0xa8c) 
[0xaaaaeb6a51c8]
postgres: heikki postgres [local] SELECT(_hash_next+0x2a8) [0xaaaaeb69d780]
postgres: heikki postgres [local] SELECT(hashgettuple+0x5a4) 
[0xaaaaeb682920]
postgres: heikki postgres [local] SELECT(index_getnext_tid+0x4b4) 
[0xaaaaeb73322c]
postgres: heikki postgres [local] SELECT(index_getnext_slot+0x90) 
[0xaaaaeb733ebc]
postgres: heikki postgres [local] SELECT(+0x19507d8) [0xaaaaebd707d8]
postgres: heikki postgres [local] SELECT(+0x189a47c) [0xaaaaebcba47c]
postgres: heikki postgres [local] SELECT(+0x189a588) [0xaaaaebcba588]
postgres: heikki postgres [local] SELECT(ExecScan+0x18c) [0xaaaaebcbab24]
postgres: heikki postgres [local] SELECT(+0x1953a00) [0xaaaaebd73a00]
postgres: heikki postgres [local] SELECT(+0x188bf9c) [0xaaaaebcabf9c]
postgres: heikki postgres [local] SELECT(+0x18cb754) [0xaaaaebceb754]
postgres: heikki postgres [local] SELECT(+0x18ccd70) [0xaaaaebcecd70]
postgres: heikki postgres [local] SELECT(+0x18dae40) [0xaaaaebcfae40]
postgres: heikki postgres [local] SELECT(+0x18d98d8) [0xaaaaebcf98d8]
postgres: heikki postgres [local] SELECT(+0x188bf9c) [0xaaaaebcabf9c]
postgres: heikki postgres [local] SELECT(+0x1858814) [0xaaaaebc78814]
postgres: heikki postgres [local] SELECT(+0x1863318) [0xaaaaebc83318]
postgres: heikki postgres [local] SELECT(standard_ExecutorRun+0x594) 
[0xaaaaebc7a034]

The first attached patch fixes it. It's pretty straightforward: the 
function was using so->currPos.buf, but that's only valid if the page 
was already pinned on entry to the function. It should be using the 
local 'buf' variable instead.

The second patch simplifies the condition in the 'unlock_page' part. 
This isn't new, and isn't needed to fix the bug, it just caught my eye 
while looking at this. I don't understand why the condition was the way 
it was, checking just 'havePin' seems sufficient and more correct to me. 
Am I missing something?

[1] 
https://www.postgresql.org/message-id/CALzhyqxrc1ZHYmf5V8NE%2ByMboqVg7xZrQM7K2c7VS0p1v8z42w%40mail.g...

- Heikki


Attachments:

  [text/x-patch] 0001-fix-crash-on-killing-items-on-hash-overflow-pages.patch (1.1K, 2-0001-fix-crash-on-killing-items-on-hash-overflow-pages.patch)
  download | inline diff:
From de433e4092e33e49a5f01c71d5ab6b34320167cd Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 17 Mar 2026 19:06:04 +0200
Subject: [PATCH 1/2] fix crash on killing items on hash overflow pages

---
 src/backend/access/hash/hashutil.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/hash/hashutil.c b/src/backend/access/hash/hashutil.c
index 3e16119d027..081adbc88a6 100644
--- a/src/backend/access/hash/hashutil.c
+++ b/src/backend/access/hash/hashutil.c
@@ -600,7 +600,7 @@ _hash_kill_items(IndexScanDesc scan)
 					 * update the page while just holding a share lock. If we
 					 * are not allowed, there's no point continuing.
 					 */
-					if (!BufferBeginSetHintBits(so->currPos.buf))
+					if (!BufferBeginSetHintBits(buf))
 						goto unlock_page;
 				}
 
@@ -621,7 +621,7 @@ _hash_kill_items(IndexScanDesc scan)
 	if (killedsomething)
 	{
 		opaque->hasho_flag |= LH_PAGE_HAS_DEAD_TUPLES;
-		BufferFinishSetHintBits(so->currPos.buf, true, true);
+		BufferFinishSetHintBits(buf, true, true);
 	}
 
 unlock_page:
-- 
2.47.3



  [text/x-patch] 0002-simplify-the-condition-in-unlock_page.patch (778B, 3-0002-simplify-the-condition-in-unlock_page.patch)
  download | inline diff:
From 8e7e3f6e63c92db8a7d8139ec60c0fdee98830ac Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 17 Mar 2026 19:08:21 +0200
Subject: [PATCH 2/2] simplify the condition in 'unlock_page'

---
 src/backend/access/hash/hashutil.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/backend/access/hash/hashutil.c b/src/backend/access/hash/hashutil.c
index 081adbc88a6..1d1b05f876a 100644
--- a/src/backend/access/hash/hashutil.c
+++ b/src/backend/access/hash/hashutil.c
@@ -625,8 +625,7 @@ _hash_kill_items(IndexScanDesc scan)
 	}
 
 unlock_page:
-	if (so->hashso_bucket_buf == so->currPos.buf ||
-		havePin)
+	if (havePin)
 		LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK);
 	else
 		_hash_relbuf(rel, buf);
-- 
2.47.3



^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Assertion failure in hash_kill_items()
@ 2026-03-17 17:40  Andres Freund <[email protected]>
  parent: Heikki Linnakangas <[email protected]>
  0 siblings, 2 replies; 4+ messages in thread

From: Andres Freund @ 2026-03-17 17:40 UTC (permalink / raw)
  To: Heikki Linnakangas <[email protected]>; +Cc: pgsql-hackers; Alexander Kuzmenkov <[email protected]>; Ashutosh Sharma <[email protected]>

Hi,

On 2026-03-17 19:15:10 +0200, Heikki Linnakangas wrote:
> I bumped into an assertion failure, while playing with variants of the test
> case that Alexander Kuzmenkov wrote to exercise hash index page cleanup [1].
> This is master-only, related to the recent changes in how buffers are marked
> dirty.

Sorry, I had hoped to push a fix for that already, after it was reported in
  https://postgr.es/m/vjtmvwvbxt7w5uyacxpzibpj65ewcb7uqaqbhd4arvnjbp5jqz%405ksdh6fsyqve
but real life intervened.

I was planning to commit it together with an addition to
  src/test/modules/index/specs/killtuples.spec

Unfortunately that made the change a good bit more verbose, as a naive
addition would report a number of buffer accesses that seemed not necessarily
reliable to me.   So I updated the 'result' step to just return true/false
depending on whether there were any accesses.

I'll go and work on pushing that.


> The first attached patch fixes it. It's pretty straightforward: the function
> was using so->currPos.buf, but that's only valid if the page was already
> pinned on entry to the function. It should be using the local 'buf' variable
> instead.

Yea, that was a stupid bug on my part. No idea how I ended up with it.  At
first I thought it might have been a rebase issue, but I didn't see any
relevant change that could have caused that.


> The second patch simplifies the condition in the 'unlock_page' part. This
> isn't new, and isn't needed to fix the bug, it just caught my eye while
> looking at this. I don't understand why the condition was the way it was,
> checking just 'havePin' seems sufficient and more correct to me. Am I
> missing something?

I can't see anything either, quite odd.  Most likely explanation seems to be
that something changed during the development of 7c75ef571579.


Indeed, the first version of the patch from
   https://postgr.es/m/CAE9k0Pm3KTx93K8_5j6VMzG4h5F%2BSyknxUwXrN-zqSZ9X8ZS3w%40mail.gmail.com
was using "if (so->hashso_bucket_buf == so->currPos.buf)" both at the start
and end of _hash_kill_items(). So likely it was just an accident during patch
revisions.

Greetings,

Andres Freund





^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Assertion failure in hash_kill_items()
@ 2026-03-17 20:50  Andres Freund <[email protected]>
  parent: Andres Freund <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Andres Freund @ 2026-03-17 20:50 UTC (permalink / raw)
  To: Heikki Linnakangas <[email protected]>; +Cc: pgsql-hackers; Alexander Kuzmenkov <[email protected]>; Ashutosh Sharma <[email protected]>

Hi,

On 2026-03-17 13:40:10 -0400, Andres Freund wrote:
> On 2026-03-17 19:15:10 +0200, Heikki Linnakangas wrote:
> > I bumped into an assertion failure, while playing with variants of the test
> > case that Alexander Kuzmenkov wrote to exercise hash index page cleanup [1].
> > This is master-only, related to the recent changes in how buffers are marked
> > dirty.
> 
> Sorry, I had hoped to push a fix for that already, after it was reported in
>   https://postgr.es/m/vjtmvwvbxt7w5uyacxpzibpj65ewcb7uqaqbhd4arvnjbp5jqz%405ksdh6fsyqve
> but real life intervened.
> 
> I was planning to commit it together with an addition to
>   src/test/modules/index/specs/killtuples.spec
> 
> Unfortunately that made the change a good bit more verbose, as a naive
> addition would report a number of buffer accesses that seemed not necessarily
> reliable to me.   So I updated the 'result' step to just return true/false
> depending on whether there were any accesses.
> 
> I'll go and work on pushing that.

Done, as of f5eb854ab6d.

Greetings,

Andres





^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Assertion failure in hash_kill_items()
@ 2026-04-07 14:38  Heikki Linnakangas <[email protected]>
  parent: Andres Freund <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Heikki Linnakangas @ 2026-04-07 14:38 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; +Cc: pgsql-hackers; Alexander Kuzmenkov <[email protected]>; Ashutosh Sharma <[email protected]>

On 17/03/2026 19:40, Andres Freund wrote:
> On 2026-03-17 19:15:10 +0200, Heikki Linnakangas wrote:
>> The second patch simplifies the condition in the 'unlock_page' part. This
>> isn't new, and isn't needed to fix the bug, it just caught my eye while
>> looking at this. I don't understand why the condition was the way it was,
>> checking just 'havePin' seems sufficient and more correct to me. Am I
>> missing something?
> 
> I can't see anything either, quite odd.  Most likely explanation seems to be
> that something changed during the development of 7c75ef571579.
> 
> 
> Indeed, the first version of the patch from
>     https://postgr.es/m/CAE9k0Pm3KTx93K8_5j6VMzG4h5F%2BSyknxUwXrN-zqSZ9X8ZS3w%40mail.gmail.com
> was using "if (so->hashso_bucket_buf == so->currPos.buf)" both at the start
> and end of _hash_kill_items(). So likely it was just an accident during patch
> revisions.

Thanks for archeological excavation; pushed this second patch now.

- Heikki






^ permalink  raw  reply  [nested|flat] 4+ messages in thread


end of thread, other threads:[~2026-04-07 14:38 UTC | newest]

Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-03-17 17:15 Assertion failure in hash_kill_items() Heikki Linnakangas <[email protected]>
2026-03-17 17:40 ` Andres Freund <[email protected]>
2026-03-17 20:50   ` Andres Freund <[email protected]>
2026-04-07 14:38   ` Heikki Linnakangas <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox