public inbox for [email protected]help / color / mirror / Atom feed
LockHasWaiters() crashes on fast-path locks 5+ messages / 2 participants [nested] [flat]
* LockHasWaiters() crashes on fast-path locks @ 2026-03-25 21:15 SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]> 0 siblings, 1 reply; 5+ messages in thread From: SATYANARAYANA NARLAPURAM @ 2026-03-25 21:15 UTC (permalink / raw) To: PostgreSQL Hackers <[email protected]> Hi Hackers, LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers are populated, but this is not the case for locks acquired via the fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on relations may not be stored in the shared lock hash table, and the LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case. If LockHasWaiters() is called for such a lock, it dereferences those NULL pointers when it reads proclock->holdMask and lock->waitMask, causing a segfault. The only existing caller is lazy_truncate_heap() in VACUUM, which queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since AccessExclusiveLock is the strongest lock level, it is never fast-pathed, so the bug has never been triggered in practice. However, any new caller that passes a weak lock mode, for example, checking whether a DDL is waiting on an AccessShareLock will crash. The fix is to transfer the lock to the main lock table before we access them. Attached a patch to address this issue. Thanks, Satya Attachments: [application/octet-stream] 0001-lock-has-waiters-fast-path-fix.patch (829B, 3-0001-lock-has-waiters-fast-path-fix.patch) download | inline diff: diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c index d930c66c..ff9b9caa 100644 --- a/src/backend/storage/lmgr/lock.c +++ b/src/backend/storage/lmgr/lock.c @@ -743,6 +743,20 @@ LockHasWaiters(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock) */ partitionLock = LockHashPartitionLock(locallock->hashcode); + /* + * If the lock was acquired via the fast path, locallock->lock and + * locallock->proclock will be NULL. We must transfer the lock to the + * main lock table before we can inspect LOCK->waitMask. + */ + if (locallock->lock == NULL) + { + PROCLOCK *fp_proclock; + + fp_proclock = FastPathGetRelationLockEntry(locallock); + locallock->lock = fp_proclock->tag.myLock; + locallock->proclock = fp_proclock; + } + LWLockAcquire(partitionLock, LW_SHARED); /* ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: LockHasWaiters() crashes on fast-path locks 2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> @ 2026-03-25 22:06 ` Bharath Rupireddy <[email protected]> 2026-03-25 22:34 ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> 0 siblings, 1 reply; 5+ messages in thread From: Bharath Rupireddy @ 2026-03-25 22:06 UTC (permalink / raw) To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> Hi, On Wed, Mar 25, 2026 at 2:15 PM SATYANARAYANA NARLAPURAM <[email protected]> wrote: > > Hi Hackers, > > LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers are populated, but this is not the case for locks acquired via the fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on relations may not be stored in the shared lock hash table, and the LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case. > > If LockHasWaiters() is called for such a lock, it dereferences those NULL pointers when it reads proclock->holdMask and lock->waitMask, causing a segfault. > > The only existing caller is lazy_truncate_heap() in VACUUM, which queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since AccessExclusiveLock is the strongest lock level, it is never fast-pathed, so the bug has never been triggered in practice. However, any new caller that passes a weak lock mode, for example, checking whether a DDL is waiting on an AccessShareLock will crash. The fix is to transfer the lock to the main lock table before we access them. > > Attached a patch to address this issue. Nice find! It would be good to add a test case (perhaps in an existing test extension even though we may not commit it; it can act as a demo). I see that this type of lock transfer is happening for prepared statements (see AtPrepare_Locks [1]). However, I see the proposed patch relying on lock == NULL for detecting whether the lock was acquired using fast-path. Although this looks correct because if the lock or proclock pointers are NULL, this identifies that the lock was taken using fast-path. But for consistency purposes, can we have the same check as that of AtPrepare_Locks? [1] /* * If the local lock was taken via the fast-path, we need to move it * to the primary lock table, or just get a pointer to the existing * primary lock table entry if by chance it's already been * transferred. */ if (locallock->proclock == NULL) -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: LockHasWaiters() crashes on fast-path locks 2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]> @ 2026-03-25 22:34 ` SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-26 21:53 ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> 0 siblings, 1 reply; 5+ messages in thread From: SATYANARAYANA NARLAPURAM @ 2026-03-25 22:34 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> Hi, On Wed, Mar 25, 2026 at 3:06 PM Bharath Rupireddy < [email protected]> wrote: > Hi, > > On Wed, Mar 25, 2026 at 2:15 PM SATYANARAYANA NARLAPURAM > <[email protected]> wrote: > > > > Hi Hackers, > > > > LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers > are populated, but this is not the case for locks acquired via the > fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on > relations may not be stored in the shared lock hash table, and the > LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case. > > > > If LockHasWaiters() is called for such a lock, it dereferences those > NULL pointers when it reads proclock->holdMask and lock->waitMask, causing > a segfault. > > > > The only existing caller is lazy_truncate_heap() in VACUUM, which > queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since > AccessExclusiveLock is the strongest lock level, it is never fast-pathed, > so the bug has never been triggered in practice. However, any new caller > that passes a weak lock mode, for example, checking whether a DDL is > waiting on an AccessShareLock will crash. The fix is to transfer the lock > to the main lock table before we access them. > > > > Attached a patch to address this issue. > > Nice find! It would be good to add a test case (perhaps in an existing > test extension even though we may not commit it; it can act as a > demo). > Please refer the patches in the thread [2] below for a repro / use case. > > I see that this type of lock transfer is happening for prepared > statements (see AtPrepare_Locks [1]). However, I see the proposed > patch relying on lock == NULL for detecting whether the lock was > acquired using fast-path. Although this looks correct because if the > lock or proclock pointers are NULL, this identifies that the lock was > taken using fast-path. But for consistency purposes, can we have the > same check as that of AtPrepare_Locks? Thank you for the review and code pointer, this is addressed now in v2 patch, attached. [2] https://www.postgresql.org/message-id/CAHg%2BQDfdoR%3D7iqEAvLW9qtzV0Sx1wp2FuALeamqcCdiVEmMF-Q%40mail... Thanks, Satya Attachments: [application/octet-stream] v2-0001-lock-has-waiters-fast-path-fix.patch (795B, 3-v2-0001-lock-has-waiters-fast-path-fix.patch) download | inline diff: diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c index d930c66c..3f4d343d 100644 --- a/src/backend/storage/lmgr/lock.c +++ b/src/backend/storage/lmgr/lock.c @@ -743,6 +743,17 @@ LockHasWaiters(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock) */ partitionLock = LockHashPartitionLock(locallock->hashcode); + /* + * If the local lock was taken via the fast-path, we need to move it + * to the primary lock table, or just get a pointer to the existing + * primary lock table entry if by chance it's already been transferred. + */ + if (locallock->proclock == NULL) + { + locallock->proclock = FastPathGetRelationLockEntry(locallock); + locallock->lock = locallock->proclock->tag.myLock; + } + LWLockAcquire(partitionLock, LW_SHARED); /* ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: LockHasWaiters() crashes on fast-path locks 2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]> 2026-03-25 22:34 ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> @ 2026-03-26 21:53 ` SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-26 23:05 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]> 0 siblings, 1 reply; 5+ messages in thread From: SATYANARAYANA NARLAPURAM @ 2026-03-26 21:53 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> Updated the patch with a commit message. On Wed, Mar 25, 2026 at 3:34 PM SATYANARAYANA NARLAPURAM < [email protected]> wrote: > Hi, > > On Wed, Mar 25, 2026 at 3:06 PM Bharath Rupireddy < > [email protected]> wrote: > >> Hi, >> >> On Wed, Mar 25, 2026 at 2:15 PM SATYANARAYANA NARLAPURAM >> <[email protected]> wrote: >> > >> > Hi Hackers, >> > >> > LockHasWaiters() assumes that the LOCALLOCK's lock and proclock >> pointers are populated, but this is not the case for locks acquired via the >> fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on >> relations may not be stored in the shared lock hash table, and the >> LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case. >> > >> > If LockHasWaiters() is called for such a lock, it dereferences those >> NULL pointers when it reads proclock->holdMask and lock->waitMask, causing >> a segfault. >> > >> > The only existing caller is lazy_truncate_heap() in VACUUM, which >> queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since >> AccessExclusiveLock is the strongest lock level, it is never fast-pathed, >> so the bug has never been triggered in practice. However, any new caller >> that passes a weak lock mode, for example, checking whether a DDL is >> waiting on an AccessShareLock will crash. The fix is to transfer the lock >> to the main lock table before we access them. >> > >> > Attached a patch to address this issue. >> >> Nice find! It would be good to add a test case (perhaps in an existing >> test extension even though we may not commit it; it can act as a >> demo). >> > > Please refer the patches in the thread [2] below for a repro / use case. > > >> >> I see that this type of lock transfer is happening for prepared >> statements (see AtPrepare_Locks [1]). However, I see the proposed >> patch relying on lock == NULL for detecting whether the lock was >> acquired using fast-path. Although this looks correct because if the >> lock or proclock pointers are NULL, this identifies that the lock was >> taken using fast-path. But for consistency purposes, can we have the >> same check as that of AtPrepare_Locks? > > > Thank you for the review and code pointer, this is addressed now in v2 > patch, attached. > > [2] > https://www.postgresql.org/message-id/CAHg%2BQDfdoR%3D7iqEAvLW9qtzV0Sx1wp2FuALeamqcCdiVEmMF-Q%40mail... > > > Thanks, > Satya > Attachments: [application/octet-stream] v3-0001-Fix-LockHasWaiters-crash-for-fast-path-locks.patch (1.8K, 3-v3-0001-Fix-LockHasWaiters-crash-for-fast-path-locks.patch) download | inline diff: From 349f3fa56abda1f9411400ae9a05efabcb2270e8 Mon Sep 17 00:00:00 2001 From: Satya Narlapuram <[email protected]> Date: Thu, 26 Mar 2026 21:42:15 +0000 Subject: [PATCH] Fix LockHasWaiters() crash for fast-path locks LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers are populated, but this is not the case for locks acquired via the fast-path optimization. Weak relation locks (< ShareUpdateExclusiveLock), including AccessShareLock, are not stored in the shared lock hash table, leaving the LOCALLOCK entry with lock = NULL and proclock = NULL. If LockHasWaiters() is called for such a lock, it dereferences those NULL pointers when reading proclock->holdMask and lock->waitMask, causing a segfault. Fix by checking whether the LOCALLOCK has been populated and, if not, calling FastPathGetRelationLockEntry() to transfer the lock from the fast-path arrays into the main lock table. --- src/backend/storage/lmgr/lock.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c index d930c66c..3f4d343d 100644 --- a/src/backend/storage/lmgr/lock.c +++ b/src/backend/storage/lmgr/lock.c @@ -743,6 +743,17 @@ LockHasWaiters(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock) */ partitionLock = LockHashPartitionLock(locallock->hashcode); + /* + * If the local lock was taken via the fast-path, we need to move it + * to the primary lock table, or just get a pointer to the existing + * primary lock table entry if by chance it's already been transferred. + */ + if (locallock->proclock == NULL) + { + locallock->proclock = FastPathGetRelationLockEntry(locallock); + locallock->lock = locallock->proclock->tag.myLock; + } + LWLockAcquire(partitionLock, LW_SHARED); /* -- 2.43.0 ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: LockHasWaiters() crashes on fast-path locks 2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-25 22:06 ` Re: LockHasWaiters() crashes on fast-path locks Bharath Rupireddy <[email protected]> 2026-03-25 22:34 ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-26 21:53 ` Re: LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> @ 2026-03-26 23:05 ` Bharath Rupireddy <[email protected]> 0 siblings, 0 replies; 5+ messages in thread From: Bharath Rupireddy @ 2026-03-26 23:05 UTC (permalink / raw) To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> Hi, On Thu, Mar 26, 2026 at 2:54 PM SATYANARAYANA NARLAPURAM <[email protected]> wrote: > > Updated the patch with a commit message. Thanks for sending the updated patch. It looks good to me. I verified it with the other thread patch - it fixes the SEGV. [1] https://www.postgresql.org/message-id/CAHg%2BQDfdoR%3D7iqEAvLW9qtzV0Sx1wp2FuALeamqcCdiVEmMF-Q%40mail... -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 5+ messages in thread
end of thread, other threads:[~2026-03-26 23:05 UTC | newest] Thread overview: 5+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-03-25 21:15 LockHasWaiters() crashes on fast-path locks SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-25 22:06 ` Bharath Rupireddy <[email protected]> 2026-03-25 22:34 ` SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-26 21:53 ` SATYANARAYANA NARLAPURAM <[email protected]> 2026-03-26 23:05 ` Bharath Rupireddy <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox