From: Zhijie Hou (Fujitsu) <[email protected]>
To: Xuneng Zhou <[email protected]>
Cc: Fujii Masao <[email protected]>
Cc: Srinath Reddy Sadipiralla <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Amit Kapila <[email protected]>
Subject: RE: Fix race in ReplicationSlotRelease for ephemeral slots
Date: Tue, 16 Jun 2026 08:54:11 +0000
Message-ID: <TY4PR01MB17718F4D0C5C8EB96A303C2E594E52@TY4PR01MB17718.jpnprd01.prod.outlook.com> (raw)
In-Reply-To: <CABPTF7VdFwiROsch4T7VbOCqQYpRbh==gAZPM6tJeff5Ou80Qw@mail.gmail.com>
References: <TY4PR01MB177184FF9EE916F577E1F554194082@TY4PR01MB17718.jpnprd01.prod.outlook.com>
	<CAFC+b6o-hD5VxVLZQovmHSYykF8Qzq3eiuBU-U1F_yR9-y6P_w@mail.gmail.com>
	<TY4PR01MB177180A7CE60BCDF286B1C6F594172@TY4PR01MB17718.jpnprd01.prod.outlook.com>
	<CABPTF7VyH1-W2xnDspECDEzFGQj=WTFpZBCqKfM11OAZa6gQHQ@mail.gmail.com>
	<CAHGQGwE+2WSqiAYgNJRkf_twdB+uRGozjjGhUn76vUKZ8dzbSA@mail.gmail.com>
	<CABPTF7VeA8szPv7LYDVY9_7LftV-HM8NFVQR2natPKmr73JW+A@mail.gmail.com>
	<TY4PR01MB1771887D33612C5A45F7E9CDF941E2@TY4PR01MB17718.jpnprd01.prod.outlook.com>
	<CAA4eK1LqFBKCkX2eoX3iQPxJJnzWTaCpdh9zNotxuoG8BgjdtA@mail.gmail.com>
	<CAA4eK1LkRdbm5XA=qa82Rp_y4rnyJh8pypMWVqOezOZpzy=Oaw@mail.gmail.com>
	<CAHGQGwG_3ff4HciHtTZ_uMvbJgSDWsz4Yawj_zQpDG6Yj=Mjng@mail.gmail.com>
	<CABPTF7WBh_mKi60EYLiueaZ_cdJvnrOrpSt3hQkuZ_uY4w5duA@mail.gmail.com>
	<CAA4eK1LJ9=BJU2oK5aFCfvW=w2muSXNHOPM18wHXHLkRzYxhTQ@mail.gmail.com>
	<CABPTF7VdFwiROsch4T7VbOCqQYpRbh==gAZPM6tJeff5Ou80Qw@mail.gmail.com>

On Tuesday, June 16, 2026 1:30 PM Xuneng Zhou <[email protected]> wrote:
> On Fri, Jun 12, 2026 at 6:54 PM Amit Kapila <[email protected]>
> wrote:
> >
> > On Fri, Jun 12, 2026 at 8:22 AM Xuneng Zhou <[email protected]>
> wrote:
> > >
> > > On Thu, Jun 11, 2026 at 9:19 PM Fujii Masao <[email protected]>
> > > In an off-list chat with Zhijie, we thought that holding the lock
> > > of the wrong db for a brief time doesn't seem to do much harm. The
> > > concurrent drop-db operation that leads to this issue seems rare in
> > > practice. He stated that the deletion of the slot seems unavoidable
> > > because we have to acquire the database lock after releasing the
> > > replication slot lock to avoid the deadlock with the startup/drop db
> > > operation. Therefore, he preferred keeping the design simple and
> > > avoiding the fatal issue over doing a broader refactoring.
> > >
> >
> > +1. I also think this change is not worth it.
> 
> I am also OK with the scope of change made by patch 1.

I have one minor comment for the 0001 patch.

+			NameData	slot_name = {0};
...
			SpinLockAcquire(&local_slot->mutex);
 			synced_slot = local_slot->in_use && local_slot->data.synced;
+			if (synced_slot)
+				slot_name = local_slot->data.name;
 			SpinLockRelease(&local_slot->mutex);

We can defer assigning slot_name until after the existing (synced_slot)
check passes. Since it's a synced slot, no other process can change it at
that point, so we can also skip initializing slot_name. (Please refer to
the attached patch for the suggested changes.)
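
For readers without the attachment at hand, the idea can be sketched roughly
as below. This is only a standalone illustration, not the attached patch:
MockSlot, mock_lock(), mock_unlock() and copy_name_if_synced() are
hypothetical stand-ins for ReplicationSlot, SpinLockAcquire(),
SpinLockRelease() and the surrounding loop body, and the unlocked copy relies
on the assumption stated above that a synced slot cannot be changed by
another process at that point.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL types; not the real definitions. */
#define NAMEDATALEN 64

typedef struct NameData
{
	char		data[NAMEDATALEN];
} NameData;

typedef struct MockSlot
{
	int			mutex;			/* placeholder for the slot's spinlock */
	bool		in_use;
	bool		synced;
	NameData	name;
} MockSlot;

/* Single-threaded placeholders for SpinLockAcquire/SpinLockRelease. */
static void mock_lock(int *m)   { *m = 1; }
static void mock_unlock(int *m) { *m = 0; }

/*
 * Returns true and fills *slot_name only if the slot is still an in-use
 * synced slot. The caller does not need to initialize *slot_name, because
 * it is never read on the false path.
 */
static bool
copy_name_if_synced(MockSlot *slot, NameData *slot_name)
{
	bool		synced_slot;

	/* Only the flags are read under the lock. */
	mock_lock(&slot->mutex);
	synced_slot = slot->in_use && slot->synced;
	mock_unlock(&slot->mutex);

	if (!synced_slot)
		return false;

	/*
	 * Deferred assignment, done after the check and without the lock.
	 * Assumption taken from the comment above: a synced slot is not
	 * modified concurrently at this point.
	 */
	*slot_name = slot->name;
	return true;
}
```

With this shape, the "= {0}" initializer and the extra branch inside the
spinlock section both go away, and the locked region stays as short as it was
before the patch.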

Best Regards,
Hou zj



Attachments:

  [application/octet-stream] 0001-comments_patch (1.3K, 2-0001-comments_patch)