public inbox for [email protected]  
help / color / mirror / Atom feed
From: Xuneng Zhou <[email protected]>
To: Zhijie Hou (Fujitsu) <[email protected]>
Cc: Fujii Masao <[email protected]>
Cc: Srinath Reddy Sadipiralla <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Fix race in ReplicationSlotRelease for ephemeral slots
Date: Sat, 6 Jun 2026 21:25:39 +0800
Message-ID: <CABPTF7W+rDN6f_sAvryvJTf2hSEQNqXBbADN2oQknoimcFwBSw@mail.gmail.com> (raw)
In-Reply-To: <TY4PR01MB1771887D33612C5A45F7E9CDF941E2@TY4PR01MB17718.jpnprd01.prod.outlook.com>
References: <TY4PR01MB177184FF9EE916F577E1F554194082@TY4PR01MB17718.jpnprd01.prod.outlook.com>
	<CAFC+b6o-hD5VxVLZQovmHSYykF8Qzq3eiuBU-U1F_yR9-y6P_w@mail.gmail.com>
	<TY4PR01MB177180A7CE60BCDF286B1C6F594172@TY4PR01MB17718.jpnprd01.prod.outlook.com>
	<CABPTF7VyH1-W2xnDspECDEzFGQj=WTFpZBCqKfM11OAZa6gQHQ@mail.gmail.com>
	<CAHGQGwE+2WSqiAYgNJRkf_twdB+uRGozjjGhUn76vUKZ8dzbSA@mail.gmail.com>
	<CABPTF7VeA8szPv7LYDVY9_7LftV-HM8NFVQR2natPKmr73JW+A@mail.gmail.com>
	<TY4PR01MB1771887D33612C5A45F7E9CDF941E2@TY4PR01MB17718.jpnprd01.prod.outlook.com>

On Sat, Jun 6, 2026 at 5:35 PM Zhijie Hou (Fujitsu)
<[email protected]> wrote:
>
> On Friday, June 5, 2026 8:45 PM Xuneng Zhou <[email protected]> wrote:
> > On Wed, Jun 3, 2026 at 8:03 PM Fujii Masao <[email protected]> wrote:
> > >
> > > On Tue, Jun 2, 2026 at 3:00 PM Xuneng Zhou <[email protected]>
> > wrote:
> > >
> > >   /* Drop the local slot if it is not required to be retained. */
> > >   if (!local_sync_slot_required(local_slot, remote_slot_list))
> > >   {
> > > + bool dropped = false;
> > > + NameData slot_name = {0};
> > > + Oid slot_database = local_slot->data.database;
> > >   bool synced_slot;
> > >
> > > Is it really safe to read slot_database before acquiring the database lock?
> >
> > Reading slot_database before taking the database lock seems not
> > inherently unsafe by itself. The comment suggests that the lock is
> > primarily used to prevent conflicts with the startup process running
> > ReplicationSlotsDropDBSlots() during db-drop replay; it does not
> > protect replication slot array reuse.
> >
> > The unsafe part could be reading slot_database from local_slot after
> > ReplicationSlotControlLock has been released. At this point, the slot
> > array cell may already have been freed and reused, so the value read
> > may no longer belong to the slot that get_local_synced_slots()
> > originally collected. As a result, we could end up locking the wrong
> > database.
> >
> > There seems to be two related issues:
> >
> > 1) Before drop: reading local_slot->data.database /
> > local_slot->data.name after the slot-array lock was released, before
> > verifying the cell still represents the same synced slot.
>
> I recall condition (1) is considered acceptable, since the database lock is
> released immediately after re-verifying that the slot is no longer the original
> 'synced' one anyway. Additionally, this race can only occur when replaying a
> DROP DATABASE, which is rare in practice. Since we only take a shared lock, it
> does not seem to cause real issues.
>
> >
> > 2) After drop: calling ReplicationSlotDropAcquired(false) and then
> > reading local_slot->data.database / local_slot->data.name for
> > unlocking/logging after the cell may have been reused by another
> > backend.
>
> Right. I missed this race condition during implementation and agree it's a real
> issue. An even bigger problem is that we could release a lock on the wrong
> database if the slot entry is reused after being dropped. I think we should fix
> this by saving the database OID at the beginning, as your patch does.
>
> >
> > The prior patch targets to fix 2), but leaves 1) unfixed.
> >
> > > BTW, I'm also wondering whether it's safe for
> > > local_sync_slot_required() to access local_slot without holding a lock.
> >
> > I share the same concern here. The issue smells similar to the one
> > discussed above -- reading values from the array cell directly without
> > the protection of array lock.
> >
> > To solve them altogether, it might be
> > better to stop carrying raw ReplicationSlot * values across
> > unprotected windows. We can get the snapshot values such as slot name
> > & database oid from get_local_synced_slots() instead. Then
> > local_sync_slot_required() works from the snapshot, and
> > drop_local_obsolete_slots() uses the copied database OID to take the
> > database lock. Before dropping, it should revalidate by copied
> > name/database that the slot is still a synced logical slot, then
> > acquire/drop by copied name, and use only copied values for
> > unlock/logging. I plan to prepare a refactoring patch for this. Does
> > that seem like the right direction to you?
>
> Saving only the name and database OID would force us to search for the slot
> again in the loop, which was one reason we didn't implement it that way.

The extra search may not be ideal, especially for the worker with a
large number of slots to sync. But the cost could be avoided with an
extra field like slotno. Were there other blocking issues discussed
before?


-- 
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.






view thread (27+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Fix race in ReplicationSlotRelease for ephemeral slots
  In-Reply-To: <CABPTF7W+rDN6f_sAvryvJTf2hSEQNqXBbADN2oQknoimcFwBSw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox