public inbox for [email protected]help / color / mirror / Atom feed
[PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() 3+ messages / 2 participants [nested] [flat]
* [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() @ 2026-04-17 06:50 JoongHyuk Shin <[email protected]> 2026-04-17 06:59 ` RE: [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() Zhijie Hou (Fujitsu) <[email protected]> 0 siblings, 1 reply; 3+ messages in thread From: JoongHyuk Shin @ 2026-04-17 06:50 UTC (permalink / raw) To: [email protected] Hi, Commit 2a5225b99d7 fixed a race in ReplicationSlotsComputeRequiredXmin() where ReplicationSlotControlLock was released before the global xmin update, allowing a concurrent backend to overwrite a correct value with a stale one. ReplicationSlotsComputeRequiredLSN() has the same problem, it releases the lock before calling XLogSetReplicationSlotMinimumLSN(), so a stale minimum LSN can overwrite a correct (lower) one, potentially leading to premature WAL removal. The attached patch moves LWLockRelease() to after the LSN update, matching the xmin fix. Since 2a5225b99d7 was backpatched to all supported versions, I believe this should be as well. Attachments: [application/octet-stream] 0001-Fix-TOCTOU-race-in-ReplicationSlotsComputeRequiredLS.patch (2.0K, 3-0001-Fix-TOCTOU-race-in-ReplicationSlotsComputeRequiredLS.patch) download | inline diff: From 63c3b469d8e3addbe092861b06fb251afa4ac32a Mon Sep 17 00:00:00 2001 From: JoongHyuk Shin <[email protected]> Date: Fri, 17 Apr 2026 15:07:11 +0900 Subject: [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() ReplicationSlotsComputeRequiredLSN() released ReplicationSlotControlLock before calling XLogSetReplicationSlotMinimumLSN(), creating a window where a concurrent backend could compute a correct (lower) minimum LSN, only for the first backend to overwrite it with a stale (higher) value. This could lead to premature WAL removal. The exact same pattern was already fixed for the xmin variant in commit 2a5225b99d7, which moved the lock release in ReplicationSlotsComputeRequiredXmin() to after the global xmin update. The LSN function was missed in that fix. Move LWLockRelease() to after XLogSetReplicationSlotMinimumLSN() so the lock is held for the entire compute-and-update sequence, matching the xmin function's behavior. Author: JoongHyuk Shin <[email protected]> --- src/backend/replication/slot.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 83fcde74718..0d87f0fd39d 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -1312,6 +1312,11 @@ ReplicationSlotsComputeRequiredLSN(void) Assert(ReplicationSlotCtl != NULL); + /* + * Hold ReplicationSlotControlLock until after updating the minimum LSN. + * Without this, a concurrent backend could compute a correct (lower) + * minimum and then have it overwritten by our stale (higher) value. + */ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); for (i = 0; i < max_replication_slots + max_repack_replication_slots; i++) { @@ -1357,9 +1362,9 @@ ReplicationSlotsComputeRequiredLSN(void) restart_lsn < min_required)) min_required = restart_lsn; } - LWLockRelease(ReplicationSlotControlLock); XLogSetReplicationSlotMinimumLSN(min_required); + LWLockRelease(ReplicationSlotControlLock); } /* -- 2.52.0 ^ permalink raw reply [nested|flat] 3+ messages in thread
* RE: [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() 2026-04-17 06:50 [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() JoongHyuk Shin <[email protected]> @ 2026-04-17 06:59 ` Zhijie Hou (Fujitsu) <[email protected]> 2026-04-17 07:46 ` Re: [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() JoongHyuk Shin <[email protected]> 0 siblings, 1 reply; 3+ messages in thread From: Zhijie Hou (Fujitsu) @ 2026-04-17 06:59 UTC (permalink / raw) To: JoongHyuk Shin <[email protected]>; +Cc: [email protected] <[email protected]> On Friday, April 17, 2026 2:50 PM JoongHyuk Shin <[email protected]> wrote: > Commit 2a5225b99d7 fixed a race in ReplicationSlotsComputeRequiredXmin() > where ReplicationSlotControlLock was released before the global xmin > update, allowing a concurrent backend to overwrite a correct value with > a stale one. > > ReplicationSlotsComputeRequiredLSN() has the same problem, > it releases the lock before calling XLogSetReplicationSlotMinimumLSN(), > so a stale minimum LSN can overwrite a correct (lower) one, > potentially leading to premature WAL removal. > > The attached patch moves LWLockRelease() to after the LSN update, > matching the xmin fix. > Since 2a5225b99d7 was backpatched to all supported versions, > I believe this should be as well. Thanks for noticing this. There is an existing thread [1] that I started following 2a5225b99d7 to address the same issue. The patch you posted only increases the lock scope in ReplicationSlotsComputeRequiredLSN() but does not increase the lock level when reserving WALs, so I think it would not fix the issue. Please feel free to review the patch in that thread if you find it helpful. [1] https://commitfest.postgresql.org/patch/6451/ Best Regards, Hou zj ^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() 2026-04-17 06:50 [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() JoongHyuk Shin <[email protected]> 2026-04-17 06:59 ` RE: [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() Zhijie Hou (Fujitsu) <[email protected]> @ 2026-04-17 07:46 ` JoongHyuk Shin <[email protected]> 0 siblings, 0 replies; 3+ messages in thread From: JoongHyuk Shin @ 2026-04-17 07:46 UTC (permalink / raw) To: Zhijie Hou (Fujitsu) <[email protected]>; +Cc: [email protected] <[email protected]> Hi Zhijie, Thanks for pointing this out. You're right that my patch only covers the read side and misses the write side in ReplicationSlotReserveWal(). I should have checked for existing work before submitting. I'll withdraw this patch. Best Regards, JoongHyuk Shin On Fri, Apr 17, 2026 at 3:59 PM Zhijie Hou (Fujitsu) <[email protected]> wrote: > On Friday, April 17, 2026 2:50 PM JoongHyuk Shin <[email protected]> > wrote: > > Commit 2a5225b99d7 fixed a race in ReplicationSlotsComputeRequiredXmin() > > where ReplicationSlotControlLock was released before the global xmin > > update, allowing a concurrent backend to overwrite a correct value with > > a stale one. > > > > ReplicationSlotsComputeRequiredLSN() has the same problem, > > it releases the lock before calling XLogSetReplicationSlotMinimumLSN(), > > so a stale minimum LSN can overwrite a correct (lower) one, > > potentially leading to premature WAL removal. > > > > The attached patch moves LWLockRelease() to after the LSN update, > > matching the xmin fix. > > Since 2a5225b99d7 was backpatched to all supported versions, > > I believe this should be as well. > > Thanks for noticing this. There is an existing thread [1] that I started > following 2a5225b99d7 to address the same issue. The patch you posted > only increases the lock scope in ReplicationSlotsComputeRequiredLSN() but > does > not increase the lock level when reserving WALs, so I think it would not > fix the issue. > > Please feel free to review the patch in that thread if you find it helpful. > > [1] https://commitfest.postgresql.org/patch/6451/ > > Best Regards, > Hou zj > ^ permalink raw reply [nested|flat] 3+ messages in thread
end of thread, other threads:[~2026-04-17 07:46 UTC | newest] Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-04-17 06:50 [PATCH] Fix TOCTOU race in ReplicationSlotsComputeRequiredLSN() JoongHyuk Shin <[email protected]> 2026-04-17 06:59 ` Zhijie Hou (Fujitsu) <[email protected]> 2026-04-17 07:46 ` JoongHyuk Shin <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox