From: Amit Kapila
Date: Fri, 12 Jun 2026 16:24:11 +0530
Subject: Re: Fix race in ReplicationSlotRelease for ephemeral slots
To: Xuneng Zhou
Cc: Fujii Masao, "Zhijie Hou (Fujitsu)", Srinath Reddy Sadipiralla, PostgreSQL Hackers

On Fri, Jun 12, 2026 at 8:22 AM Xuneng Zhou wrote:
>
> The issues look real to me and could be dealt with patch v1 partially.
>
> On Thu, Jun 11, 2026 at 9:19 PM Fujii Masao wrote:
> >
> > On Thu, Jun 11, 2026 at 8:18 PM Amit Kapila wrote:
> > > 1. Stale name read in local_sync_slot_required(): The reused cell
> > > holds a different name. local_sync_slot_required() might return false
> > > (drop needed). But then the in_use && synced spinlock check sees
> > > synced = false and skips the actual drop. The wrong decision is
> > > caught.
> >
> > Yes, we could skip the actual drop. But then wouldn't we still emit
> > the log message "dropped replication slot ..." even though no slot was
> > actually dropped?
>
> With v1, we won't emit the log message unless the slot is actually
> dropped. However, it does not prevent the stale read in
> local_sync_slot_required().
>
> > > 2. Wrong database OID read at line 551: The reused cell holds OID_B
> > > from the new slot. We lock OID_B, then at lines 563–565 we see synced
> > > = false, skip the drop, and unlock OID_B at line 579. Since no drop
> > > occurred, the cell is still the same non-synced slot, so the lock and
> > > unlock see the same OID_B. Symmetric: no lock leak.
> >
> > What happens if the slot for OID_B is dropped after we lock
> > OID_B, and then a new slot for OID_C reuses the same array entry? In
> > that case, wouldn't the later unlock read OID_C from
> > local_slot->data.database even though the lock was originally taken on
> > OID_B?
>
> V1 stops doing the vulnerable second read of local_slot->data.database.
> So if the copied value was already stale and points to OID_B, v1 is at
> least symmetric:
>
>   read OID_B once
>   lock OID_B
>   cell reused as OID_C
>   unlock OID_B
>
> But v1 seems not to fully solve issue 1.
>
> It can still do this:
>
>   cell already reused before slot_database is copied
>   v1 copies OID_B from replacement slot
>   locks OID_B
>   recheck sees synced=false
>   skips drop
>   unlocks OID_B
>
> That is still a stale read and possibly a wasted/wrong database lock,
> but it doesn't leak the lock, unlock the wrong object, log a false
> drop, or drop the replacement slot.
>
> In an off-list chat with Zhijie, we kinda thought that holding the
> lock of a wrong db for a brief time doesn't seem to do much harm. The
> concurrent drop-database operation that leads to this issue seems rare
> in practice. He stated that the deletion of the slot seems unavoidable
> because we have to acquire the database lock after releasing the
> replication slot lock to avoid the deadlock with the startup/drop db
> operation.
> Therefore, he preferred keeping the design simple and
> avoiding the fatal issue over doing a broader refactoring work.
>

+1. I also think this change is not worth it.

> I don't have a strong opinion on this. Still attaching the refactoring
> patch to do some clean-up in case someone thinks it's worthwhile.
>

I feel that even if there is an argument for such a refactoring, it can
be done separately. We can push forward with 0001 and then discuss 0002
further, if required. I can take care of 0001 unless Fujii-san wishes
to take care of it.

-- 
With Regards,
Amit Kapila.
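
For readers following the lock/unlock symmetry argument quoted above, here is a toy Python model of it. This is only an illustrative sketch, not PostgreSQL code and not taken from the patch: SlotCell, LockTable, reuse_cell, both drop_* functions, and the OID values are hypothetical names invented for the example. It compares re-reading the shared cell at unlock time with copying the database OID once into a local variable, which is the behavior the thread attributes to v1.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SlotCell:
    """Toy stand-in for one entry of the shared replication slot array."""
    database: int
    synced: bool


class LockTable:
    """Toy database-lock table: counts outstanding locks per database OID."""

    def __init__(self):
        self.held = Counter()

    def lock(self, oid):
        self.held[oid] += 1

    def unlock(self, oid):
        self.held[oid] -= 1

    def outstanding(self):
        # Positive count: leaked lock. Negative count: unlock of a lock never taken.
        return {oid: n for oid, n in self.held.items() if n != 0}


OID_B, OID_C = 16401, 16402  # hypothetical database OIDs


def reuse_cell(cell):
    """Simulate a concurrent drop followed by reuse of the same array entry."""
    cell.database = OID_C
    cell.synced = False


def drop_rereading_cell(cell, locks, concurrent_step):
    locks.lock(cell.database)      # first read of the shared cell
    concurrent_step(cell)          # the cell may be reused here
    locks.unlock(cell.database)    # second read: may now see a different OID


def drop_with_local_copy(cell, locks, concurrent_step):
    slot_database = cell.database  # read the shared cell exactly once
    locks.lock(slot_database)
    concurrent_step(cell)
    locks.unlock(slot_database)    # releases exactly what was locked


racy = LockTable()
drop_rereading_cell(SlotCell(OID_B, True), racy, reuse_cell)

safe = LockTable()
drop_with_local_copy(SlotCell(OID_B, True), safe, reuse_cell)

print("re-read at unlock:", racy.outstanding())
print("local copy:", safe.outstanding())
```

In this model the re-reading variant ends with a lock still held on OID_B and an unmatched unlock on OID_C, while the local-copy variant ends with nothing outstanding. The model does not cover the remaining case from the thread, where the cell is reused before the copy is made; there the lock is taken on the wrong database but is still released symmetrically.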