From: Alexander Korotkov <[email protected]>
To: [email protected]
Subject: pgsql: Fix memory ordering in WAIT FOR LSN wakeup mechanism
Date: Sun, 03 May 2026 13:22:22 +0000
Message-ID: <[email protected]> (raw)

Fix memory ordering in WAIT FOR LSN wakeup mechanism

WAIT FOR LSN uses a Dekker-style handshake: the waker stores an LSN
position then reads minWaitedLSN; the waiter stores its target into
minWaitedLSN then reads the position.  Without a barrier between each
side's store and load, a CPU may satisfy the load before the store
becomes globally visible, causing either side to miss a concurrent
update.  The result is a missed wakeup: the waiter keeps sleeping until
an unrelated event happens to wake it.
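
As a rough illustration of the handshake described above, here is a minimal
sketch using C11 atomics rather than PostgreSQL's pg_atomic_* API.  The
names position, min_waited_lsn, waker_publish() and waiter_must_sleep() are
hypothetical stand-ins, not the identifiers used in xlogwait.c, and the
comparison rules are an assumption made for the example.  Each side does
its store, issues a full barrier, and only then does its load.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state; not the actual xlogwait.c variables. */
_Atomic uint64_t position = 0;                /* LSN published by the waker */
_Atomic uint64_t min_waited_lsn = UINT64_MAX; /* smallest LSN any waiter wants */

/*
 * Waker side: store the new position, then a full barrier, then read
 * min_waited_lsn.  Returns true if some waiter may need to be woken.
 */
bool
waker_publish(uint64_t new_pos)
{
    atomic_store_explicit(&position, new_pos, memory_order_relaxed);
    /* Without this fence the load below could be satisfied too early. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&min_waited_lsn, memory_order_relaxed) <= new_pos;
}

/*
 * Waiter side: store the target, then a full barrier, then read the
 * position.  Returns true if the target is not reached yet, so the
 * waiter should go to sleep.
 */
bool
waiter_must_sleep(uint64_t target)
{
    atomic_store_explicit(&min_waited_lsn, target, memory_order_relaxed);
    /* Mirror image of the waker's fence. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&position, memory_order_relaxed) < target;
}
```

With both fences in place, at least one side is guaranteed to observe the
other's store: either the waker sees the waiter's target and wakes it, or
the waiter sees the new position and does not sleep.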

Fix by embedding the required barriers into the atomic operations on
minWaitedLSN:

- In updateMinWaitedLSN(), use pg_atomic_write_membarrier_u64() so the
  waiter's preceding heap update is visible before the new minWaitedLSN
  value is published.

- In WaitLSNWakeup(), use pg_atomic_read_membarrier_u64() in the
  fast-path check so the waker's preceding position store is globally
  visible before minWaitedLSN is read.

The waiter side is also covered by the barrier semantics already present
in GetCurrentLSNForWaitType(): GetWalRcvWriteRecPtr() uses an explicit
read barrier (from patch 0001), while the remaining getters acquire a
spinlock, which implies the same ordering.

Also call ResetLatch() unconditionally after WaitLatch(), following the
standard latch loop pattern.  WaitLatch() does not guarantee that all
simultaneously true wake conditions are reported in one return, so a
timeout can race with SetLatch().  If we skip ResetLatch() on a timeout
return, the code performs further asynchronous-state checks before
consuming the latch, violating the latch API's required wait/reset
pattern.  That can leave the latch set across loop exit and cause a
later unrelated WaitLatch() in the same backend to return immediately.
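
The effect of skipping the reset can be modelled with a toy,
single-threaded sketch.  ToyLatch and every toy_* name below are
hypothetical and only imitate the behaviour described above; this is not
PostgreSQL's latch implementation.  The fake wait reports a timeout even
though the latch was set at the same moment.

```c
#include <stdbool.h>

/* Hypothetical toy latch; not PostgreSQL's Latch. */
typedef struct
{
    bool is_set;
} ToyLatch;

/* Toy event flags, assumed for this example only. */
enum
{
    TOY_LATCH_SET = 1,
    TOY_TIMEOUT = 2
};

void
toy_reset_latch(ToyLatch *latch)
{
    latch->is_set = false;
}

/*
 * Models a wait where a timeout races with a concurrent set: the latch
 * ends up set, but only the timeout is reported to the caller.
 */
int
toy_wait_reports_timeout(ToyLatch *latch)
{
    latch->is_set = true;       /* the racing "SetLatch" */
    return TOY_TIMEOUT;
}

/*
 * Runs one iteration of a wait loop and returns whether the latch is
 * still set afterwards.  With reset_unconditionally = false the reset
 * happens only when the wait reported the latch event.
 */
bool
iteration_leaves_latch_set(bool reset_unconditionally)
{
    ToyLatch latch = {false};
    int rc = toy_wait_reports_timeout(&latch);

    if (reset_unconditionally || (rc & TOY_LATCH_SET))
        toy_reset_latch(&latch);

    /* ... the awaited condition would be re-checked here ... */
    return latch.is_set;
}
```

In this model the conditional variant leaves the latch set after the
iteration, while the unconditional variant consumes it.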

Reported-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/zqbppucpmkeqecfy4s5kscnru4tbk6khp3ozqz6ad2zijz354k%40w4bdf4z3wqoz
Author: Xuneng Zhou <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/a80a593ab63696a0ad0e5c10b9e1b99aaa98032e

Modified Files
--------------
src/backend/access/transam/xlogwait.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)


