public inbox for [email protected]  
help / color / mirror / Atom feed
LockHasWaiters() crashes on fast-path locks
5+ messages / 2 participants
[nested] [flat]

* LockHasWaiters() crashes on fast-path locks
@ 2026-03-25 21:15 SATYANARAYANA NARLAPURAM <[email protected]>
  2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: SATYANARAYANA NARLAPURAM @ 2026-03-25 21:15 UTC (permalink / raw)
  To: PostgreSQL Hackers <[email protected]>

Hi Hackers,

LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers
are populated, but this is not the case for locks acquired via the
fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on
relations may not be stored in the shared lock hash table, and the
LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case.

If LockHasWaiters() is called for such a lock, it dereferences those NULL
pointers when it reads proclock->holdMask and lock->waitMask, causing a
segfault.

The only existing caller is lazy_truncate_heap() in VACUUM, which queries
LockHasWaitersRelation(rel, AccessExclusiveLock). Since AccessExclusiveLock
is the strongest lock level, it is never fast-pathed, so the bug has never
been triggered in practice. However, any new caller that passes a weak lock
mode, for example, checking whether a DDL is waiting on an AccessShareLock
will crash. The fix is to transfer the lock to the main lock table before
we access them.

Attached a patch to address this issue.


Thanks,
Satya


Attachments:

  [application/octet-stream] 0001-lock-has-waiters-fast-path-fix.patch (829B, 3-0001-lock-has-waiters-fast-path-fix.patch)
  download | inline diff:
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index d930c66c..ff9b9caa 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -743,6 +743,20 @@ LockHasWaiters(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)
 	 */
 	partitionLock = LockHashPartitionLock(locallock->hashcode);
 
+	/*
+	 * If the lock was acquired via the fast path, locallock->lock and
+	 * locallock->proclock will be NULL.  We must transfer the lock to the
+	 * main lock table before we can inspect LOCK->waitMask.
+	 */
+	if (locallock->lock == NULL)
+	{
+		PROCLOCK   *fp_proclock;
+
+		fp_proclock = FastPathGetRelationLockEntry(locallock);
+		locallock->lock = fp_proclock->tag.myLock;
+		locallock->proclock = fp_proclock;
+	}
+
 	LWLockAcquire(partitionLock, LW_SHARED);
 
 	/*


^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: LockHasWaiters() crashes on fast-path locks
  2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
@ 2026-03-25 22:06 ` Bharath Rupireddy <[email protected]>
  2026-03-25 22:34   ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: Bharath Rupireddy @ 2026-03-25 22:06 UTC (permalink / raw)
  To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>

Hi,

On Wed, Mar 25, 2026 at 2:15 PM SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> Hi Hackers,
>
> LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers are populated, but this is not the case for locks acquired via the fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on relations may not be stored in the shared lock hash table, and the LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case.
>
> If LockHasWaiters() is called for such a lock, it dereferences those NULL pointers when it reads proclock->holdMask and lock->waitMask, causing a segfault.
>
> The only existing caller is lazy_truncate_heap() in VACUUM, which queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since AccessExclusiveLock is the strongest lock level, it is never fast-pathed, so the bug has never been triggered in practice. However, any new caller that passes a weak lock mode, for example, checking whether a DDL is waiting on an AccessShareLock will crash. The fix is to transfer the lock to the main lock table before we access them.
>
> Attached a patch to address this issue.

Nice find! It would be good to add a test case (perhaps in an existing
test extension even though we may not commit it; it can act as a
demo).

I see that this type of lock transfer is happening for prepared
statements (see AtPrepare_Locks [1]). However, I see the proposed
patch relying on lock == NULL for detecting whether the lock was
acquired using fast-path. Although this looks correct because if the
lock or proclock pointers are NULL, this identifies that the lock was
taken using fast-path. But for consistency purposes, can we have the
same check as that of AtPrepare_Locks?

[1]
/*
* If the local lock was taken via the fast-path, we need to move it
* to the primary lock table, or just get a pointer to the existing
* primary lock table entry if by chance it's already been
* transferred.
*/
if (locallock->proclock == NULL)

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: LockHasWaiters() crashes on fast-path locks
  2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
  2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]>
@ 2026-03-25 22:34   ` SATYANARAYANA NARLAPURAM <[email protected]>
  2026-03-26 21:53     ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: SATYANARAYANA NARLAPURAM @ 2026-03-25 22:34 UTC (permalink / raw)
  To: Bharath Rupireddy <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>

Hi,

On Wed, Mar 25, 2026 at 3:06 PM Bharath Rupireddy <
[email protected]> wrote:

> Hi,
>
> On Wed, Mar 25, 2026 at 2:15 PM SATYANARAYANA NARLAPURAM
> <[email protected]> wrote:
> >
> > Hi Hackers,
> >
> > LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers
> are populated, but this is not the case for locks acquired via the
> fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on
> relations may not be stored in the shared lock hash table, and the
> LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case.
> >
> > If LockHasWaiters() is called for such a lock, it dereferences those
> NULL pointers when it reads proclock->holdMask and lock->waitMask, causing
> a segfault.
> >
> > The only existing caller is lazy_truncate_heap() in VACUUM, which
> queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since
> AccessExclusiveLock is the strongest lock level, it is never fast-pathed,
> so the bug has never been triggered in practice. However, any new caller
> that passes a weak lock mode, for example, checking whether a DDL is
> waiting on an AccessShareLock will crash. The fix is to transfer the lock
> to the main lock table before we access them.
> >
> > Attached a patch to address this issue.
>
> Nice find! It would be good to add a test case (perhaps in an existing
> test extension even though we may not commit it; it can act as a
> demo).
>

Please refer the patches in the thread [2] below for a repro / use case.


>
> I see that this type of lock transfer is happening for prepared
> statements (see AtPrepare_Locks [1]). However, I see the proposed
> patch relying on lock == NULL for detecting whether the lock was
> acquired using fast-path. Although this looks correct because if the
> lock or proclock pointers are NULL, this identifies that the lock was
> taken using fast-path. But for consistency purposes, can we have the
> same check as that of AtPrepare_Locks?


Thank you for the review and code pointer, this is addressed now in v2
patch, attached.

[2]
https://www.postgresql.org/message-id/CAHg%2BQDfdoR%3D7iqEAvLW9qtzV0Sx1wp2FuALeamqcCdiVEmMF-Q%40mail...


Thanks,
Satya


Attachments:

  [application/octet-stream] v2-0001-lock-has-waiters-fast-path-fix.patch (795B, 3-v2-0001-lock-has-waiters-fast-path-fix.patch)
  download | inline diff:
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index d930c66c..3f4d343d 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -743,6 +743,17 @@ LockHasWaiters(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)
 	 */
 	partitionLock = LockHashPartitionLock(locallock->hashcode);
 
+	/*
+	 * If the local lock was taken via the fast-path, we need to move it
+	 * to the primary lock table, or just get a pointer to the existing
+	 * primary lock table entry if by chance it's already been transferred.
+	 */
+	if (locallock->proclock == NULL)
+	{
+		locallock->proclock = FastPathGetRelationLockEntry(locallock);
+		locallock->lock = locallock->proclock->tag.myLock;
+	}
+
 	LWLockAcquire(partitionLock, LW_SHARED);
 
 	/*


^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: LockHasWaiters() crashes on fast-path locks
  2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
  2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]>
  2026-03-25 22:34   ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
@ 2026-03-26 21:53     ` SATYANARAYANA NARLAPURAM <[email protected]>
  2026-03-26 23:05       ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: SATYANARAYANA NARLAPURAM @ 2026-03-26 21:53 UTC (permalink / raw)
  To: Bharath Rupireddy <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>

Updated the patch with a commit message.

On Wed, Mar 25, 2026 at 3:34 PM SATYANARAYANA NARLAPURAM <
[email protected]> wrote:

> Hi,
>
> On Wed, Mar 25, 2026 at 3:06 PM Bharath Rupireddy <
> [email protected]> wrote:
>
>> Hi,
>>
>> On Wed, Mar 25, 2026 at 2:15 PM SATYANARAYANA NARLAPURAM
>> <[email protected]> wrote:
>> >
>> > Hi Hackers,
>> >
>> > LockHasWaiters() assumes that the LOCALLOCK's lock and proclock
>> pointers are populated, but this is not the case for locks acquired via the
>> fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on
>> relations may not be stored in the shared lock hash table, and the
>> LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case.
>> >
>> > If LockHasWaiters() is called for such a lock, it dereferences those
>> NULL pointers when it reads proclock->holdMask and lock->waitMask, causing
>> a segfault.
>> >
>> > The only existing caller is lazy_truncate_heap() in VACUUM, which
>> queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since
>> AccessExclusiveLock is the strongest lock level, it is never fast-pathed,
>> so the bug has never been triggered in practice. However, any new caller
>> that passes a weak lock mode, for example, checking whether a DDL is
>> waiting on an AccessShareLock will crash. The fix is to transfer the lock
>> to the main lock table before we access them.
>> >
>> > Attached a patch to address this issue.
>>
>> Nice find! It would be good to add a test case (perhaps in an existing
>> test extension even though we may not commit it; it can act as a
>> demo).
>>
>
> Please refer the patches in the thread [2] below for a repro / use case.
>
>
>>
>> I see that this type of lock transfer is happening for prepared
>> statements (see AtPrepare_Locks [1]). However, I see the proposed
>> patch relying on lock == NULL for detecting whether the lock was
>> acquired using fast-path. Although this looks correct because if the
>> lock or proclock pointers are NULL, this identifies that the lock was
>> taken using fast-path. But for consistency purposes, can we have the
>> same check as that of AtPrepare_Locks?
>
>
> Thank you for the review and code pointer, this is addressed now in v2
> patch, attached.
>
> [2]
> https://www.postgresql.org/message-id/CAHg%2BQDfdoR%3D7iqEAvLW9qtzV0Sx1wp2FuALeamqcCdiVEmMF-Q%40mail...
>
>
> Thanks,
> Satya
>


Attachments:

  [application/octet-stream] v3-0001-Fix-LockHasWaiters-crash-for-fast-path-locks.patch (1.8K, 3-v3-0001-Fix-LockHasWaiters-crash-for-fast-path-locks.patch)
  download | inline diff:
From 349f3fa56abda1f9411400ae9a05efabcb2270e8 Mon Sep 17 00:00:00 2001
From: Satya Narlapuram <[email protected]>
Date: Thu, 26 Mar 2026 21:42:15 +0000
Subject: [PATCH] Fix LockHasWaiters() crash for fast-path locks

LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers
are populated, but this is not the case for locks acquired via the
fast-path optimization.  Weak relation locks (< ShareUpdateExclusiveLock),
including AccessShareLock, are not stored in the shared lock hash table,
leaving the LOCALLOCK entry with lock = NULL and proclock = NULL.

If LockHasWaiters() is called for such a lock, it dereferences those NULL
pointers when reading proclock->holdMask and lock->waitMask, causing a
segfault.


Fix by checking whether the LOCALLOCK has been populated and, if not,
calling FastPathGetRelationLockEntry() to transfer the lock from the
fast-path arrays into the main lock table.
---
 src/backend/storage/lmgr/lock.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index d930c66c..3f4d343d 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -743,6 +743,17 @@ LockHasWaiters(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)
 	 */
 	partitionLock = LockHashPartitionLock(locallock->hashcode);
 
+	/*
+	 * If the local lock was taken via the fast-path, we need to move it
+	 * to the primary lock table, or just get a pointer to the existing
+	 * primary lock table entry if by chance it's already been transferred.
+	 */
+	if (locallock->proclock == NULL)
+	{
+		locallock->proclock = FastPathGetRelationLockEntry(locallock);
+		locallock->lock = locallock->proclock->tag.myLock;
+	}
+
 	LWLockAcquire(partitionLock, LW_SHARED);
 
 	/*
-- 
2.43.0



^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: LockHasWaiters() crashes on fast-path locks
  2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
  2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]>
  2026-03-25 22:34   ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
  2026-03-26 21:53     ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
@ 2026-03-26 23:05       ` Bharath Rupireddy <[email protected]>
  0 siblings, 0 replies; 5+ messages in thread

From: Bharath Rupireddy @ 2026-03-26 23:05 UTC (permalink / raw)
  To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>

Hi,

On Thu, Mar 26, 2026 at 2:54 PM SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> Updated the patch with a commit message.

Thanks for sending the updated patch. It looks good to me. I verified
it with the other thread patch - it fixes the SEGV.

[1] https://www.postgresql.org/message-id/CAHg%2BQDfdoR%3D7iqEAvLW9qtzV0Sx1wp2FuALeamqcCdiVEmMF-Q%40mail...

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 5+ messages in thread


end of thread, other threads:[~2026-03-26 23:05 UTC | newest]

Thread overview: 5+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]>
2026-03-25 22:06 ` Bharath Rupireddy <[email protected]>
2026-03-25 22:34   ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-03-26 21:53     ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-03-26 23:05       ` Bharath Rupireddy <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox