From: Amit Kapila
Date: Thu, 11 Jun 2026 14:52:24 +0530
Subject: Re: Fix race in ReplicationSlotRelease for ephemeral slots
To: "Zhijie Hou (Fujitsu)"
Cc: Xuneng Zhou, Fujii Masao, Srinath Reddy Sadipiralla, PostgreSQL Hackers

On Sat, Jun 6, 2026 at 3:05 PM Zhijie Hou (Fujitsu) wrote:
>
> On Friday, June 5, 2026 8:45 PM Xuneng Zhou wrote:
> > On Wed, Jun 3, 2026 at 8:03 PM Fujii Masao wrote:
> > >
> > > On Tue, Jun 2, 2026 at 3:00 PM Xuneng Zhou wrote:
> > >
> > >   /* Drop the local slot if it is not required to be retained. */
> > >   if (!local_sync_slot_required(local_slot, remote_slot_list))
> > >   {
> > > +     bool dropped = false;
> > > +     NameData slot_name = {0};
> > > +     Oid slot_database = local_slot->data.database;
> > >       bool synced_slot;
> > >
> > > Is it really safe to read slot_database before acquiring the database lock?
> >
> > Reading slot_database before taking the database lock seems not
> > inherently unsafe by itself.
> > The comment suggests that the lock is
> > primarily used to prevent conflicts with the startup process running
> > ReplicationSlotsDropDBSlots() during db-drop replay; it does not
> > protect replication slot array reuse.
> >
> > The unsafe part could be reading slot_database from local_slot after
> > ReplicationSlotControlLock has been released. At this point, the slot
> > array cell may already have been freed and reused, so the value read
> > may no longer belong to the slot that get_local_synced_slots()
> > originally collected. As a result, we could end up locking the wrong
> > database.
> >
> > There seems to be two related issues:
> >
> > 1) Before drop: reading local_slot->data.database /
> > local_slot->data.name after the slot-array lock was released, before
> > verifying the cell still represents the same synced slot.
>
> I recall condition (1) is considered acceptable, since the database lock is
> released immediately after re-verifying that the slot is no longer the original
> 'synced' one anyway. Additionally, this race can only occur when replaying a
> DROP DATABASE, which is rare in practice. Since we only take a shared lock, it
> does not seem to cause real issues.
>

It seems that (1) is talking about the access to local_slot->data.name
before we acquire the database lock in local_sync_slot_required(), whereas
your response doesn't seem to address that concern. If not, then how
exactly does the database lock protect what we are doing in
local_sync_slot_required()?

-- 
With Regards,
Amit Kapila.
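
For readers following the thread, the stale-cell concern raised above can be
illustrated with a small standalone sketch. This is not PostgreSQL code and
not the patch under discussion: DemoSlot, DemoSnapshot, demo_set(),
demo_snapshot() and demo_still_same() are hypothetical names, and the locks
appear only as comments. It assumes the general pattern of copying the
identifying fields while the array lock is held, then re-verifying the cell
against that copy before acting on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS 4
#define NAME_LEN  64

/* Hypothetical stand-in for one cell of a shared slot array. */
typedef struct DemoSlot
{
    bool     in_use;
    bool     synced;
    uint32_t database;
    char     name[NAME_LEN];
} DemoSlot;

/* Private copy of the fields that identify a slot. */
typedef struct DemoSnapshot
{
    uint32_t database;
    char     name[NAME_LEN];
} DemoSnapshot;

static DemoSlot demo_slots[MAX_SLOTS];

/* Fill (or reuse) a cell; models another backend recycling the array entry. */
static void
demo_set(int idx, const char *name, uint32_t database, bool synced)
{
    assert(idx >= 0 && idx < MAX_SLOTS);
    DemoSlot *slot = &demo_slots[idx];

    slot->in_use = true;
    slot->synced = synced;
    slot->database = database;
    strncpy(slot->name, name, NAME_LEN - 1);
    slot->name[NAME_LEN - 1] = '\0';
}

/*
 * Step 1: copy name and database. In a real system this would run while
 * the array lock is still held, so both fields come from the same slot.
 */
static void
demo_snapshot(int idx, DemoSnapshot *snap)
{
    assert(idx >= 0 && idx < MAX_SLOTS);
    snap->database = demo_slots[idx].database;
    memcpy(snap->name, demo_slots[idx].name, NAME_LEN);
}

/*
 * Step 2: after the array lock was dropped (and any other lock taken),
 * check that the cell still describes the slot that was snapshotted.
 */
static bool
demo_still_same(int idx, const DemoSnapshot *snap)
{
    assert(idx >= 0 && idx < MAX_SLOTS);
    const DemoSlot *slot = &demo_slots[idx];

    return slot->in_use &&
           slot->synced &&
           slot->database == snap->database &&
           strcmp(slot->name, snap->name) == 0;
}
```

In this sketch, a caller that reads the cell directly after the lock is
released may see the fields of a different slot, whereas a caller that
compares against its snapshot can detect the reuse and skip the drop.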