public inbox for [email protected]  
help / color / mirror / Atom feed
From: Fujii Masao <[email protected]>
To: Xuneng Zhou <[email protected]>
Cc: Zhijie Hou (Fujitsu) <[email protected]>
Cc: Srinath Reddy Sadipiralla <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Fix race in ReplicationSlotRelease for ephemeral slots
Date: Wed, 3 Jun 2026 21:03:39 +0900
Message-ID: <CAHGQGwE+2WSqiAYgNJRkf_twdB+uRGozjjGhUn76vUKZ8dzbSA@mail.gmail.com> (raw)
In-Reply-To: <CABPTF7VyH1-W2xnDspECDEzFGQj=WTFpZBCqKfM11OAZa6gQHQ@mail.gmail.com>
References: <TY4PR01MB177184FF9EE916F577E1F554194082@TY4PR01MB17718.jpnprd01.prod.outlook.com>
	<CAFC+b6o-hD5VxVLZQovmHSYykF8Qzq3eiuBU-U1F_yR9-y6P_w@mail.gmail.com>
	<TY4PR01MB177180A7CE60BCDF286B1C6F594172@TY4PR01MB17718.jpnprd01.prod.outlook.com>
	<CABPTF7VyH1-W2xnDspECDEzFGQj=WTFpZBCqKfM11OAZa6gQHQ@mail.gmail.com>

On Tue, Jun 2, 2026 at 3:00 PM Xuneng Zhou <[email protected]> wrote:
>
> Hi,
>
> On Sat, May 30, 2026 at 4:14 PM Zhijie Hou (Fujitsu)
> <[email protected]> wrote:
> > I'm not sure if adding an injection point for this rare case is worthwhile. Even
> > if we were to add one, future refactoring of that function could shift the
> > position of the injection point, so its long-term usefulness is uncertain. I
> > don't have a strong opinion on this, so I'll leave it to Fujii-San to decide.

I've pushed the patch. Thanks!

IMO the proposed test looks a bit too narrow, so I'm not sure it's worth
adding at this point. For now, I've committed only the code fix.


> There's an adjacent bug around drop_local_obsolete_slots. The root
> cause of them looks similar -- ReplicationSlot * is a pointer to a
> reusable shared-memory array cell, not a durable identity for the same
> slot. In drop_local_obsolete_slots, the issue is that the slot has
> been freed after ReplicationSlotDropAcquired(false); however, another
> backend may reuse the same cell before the unlock/log reads. This
> seems less severe -- it does not normally corrupt slot state, because
> the code only read after the drop. But it can still unlock/log
> misusing the identity of a different slot. Attached a test using
> injection point to reproduce it and a patch to fix it.

Thanks again for the report and patch!

  /* Drop the local slot if it is not required to be retained. */
  if (!local_sync_slot_required(local_slot, remote_slot_list))
  {
+ bool dropped = false;
+ NameData slot_name = {0};
+ Oid slot_database = local_slot->data.database;
  bool synced_slot;

Is it really safe to read slot_database before acquiring the database lock?

BTW, I'm also wondering whether it's safe for
local_sync_slot_required() to access local_slot without holding a lock.

Regards,

-- 
Fujii Masao






view thread (27+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Fix race in ReplicationSlotRelease for ephemeral slots
  In-Reply-To: <CAHGQGwE+2WSqiAYgNJRkf_twdB+uRGozjjGhUn76vUKZ8dzbSA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox