From: Amit Kapila
Date: Sat, 20 Jun 2026 15:12:42 +0530
Subject: Re: Fix race in ReplicationSlotRelease for ephemeral slots
To: Xuneng Zhou
Cc: Fujii Masao, "Zhijie Hou (Fujitsu)", Srinath Reddy Sadipiralla, PostgreSQL Hackers

On Sat, Jun 20, 2026 at 12:11 PM Xuneng Zhou wrote:
>
> On Fri, Jun 19, 2026 at 8:08 PM Amit Kapila wrote:
> >
> > On Thu, Jun 18, 2026 at 2:06 PM Xuneng Zhou wrote:
> > >
> > > OK, how about elaborate it a bit like this:
> > >
> > > /*
> > >  * In the small window between getting the slot to drop and
> > >  * locking the database, there is a possibility of a parallel
> > >  * database drop by the startup process and the creation of a new
> > >  * slot by the user. This new user-created slot may end up using
> > >  * the same shared memory as that of 'local_slot'.
> > >  *
> > >  * If that happens, local_slot now describes the replacement slot:
> > >  * local_sync_slot_required() may have made its drop decision using
> > >  * the replacement slot's name or invalidation state, and slot_database
> > >  * may refer to the replacement slot's database. Thus check if
> > >  * local_slot is still a synced slot before performing the actual drop.
> > >  * This does not prove it is the original slot, but it prevents dropping
> > >  * an ordinary user-created replacement slot, and the copied database OID
> > >  * keeps lock/unlock symmetric. The remaining risk is limited to this
> > >  * cleanup cycle, such as briefly holding an unrelated database lock, and
> > >  * is acceptable here because this race is rare.
> > >  */
> > >
> >
> > Okay inspired from your and Fujii-san's version, here is a third version:
> > /*
> >  * In the small window between getting the slot to drop and
> >  * locking the database, there is a possibility of a parallel
> >  * database drop by the startup process and the creation of a new
> >  * slot by the user. This new user-created slot may end up using
> >  * the same shared memory as that of 'local_slot'.
> >  *
> >  * Because local_slot still points to a reusable slot-array entry,
> >  * its fields (name, database OID, invalidation state) may already
> >  * describe such a replacement slot by the time we reach here. That
> >  * means the drop decision made by local_sync_slot_required() above
> >  * could have been based on the replacement slot's data, and
> >  * slot_database could refer to an unrelated database. The recheck
> >  * below keeps us from actually dropping a user-created replacement
> >  * slot; the residual risk is confined to this cycle (for example,
> >  * briefly locking an unrelated database) and is acceptable because
> >  * the race is rare and non-fatal.
> >  */
> >
> > Thoughts?
>
> LGTM. It looks well-articulated.
>

Thanks, I'll push this as soon as the PG20 branch opens.

--
With Regards,
Amit Kapila.