From: Bharath Rupireddy
Date: Mon, 6 Apr 2026 10:42:03 -0700
Subject: Re: Introduce XID age based replication slot invalidation
To: Masahiko Sawada
Cc: Srinath Reddy Sadipiralla, SATYANARAYANA NARLAPURAM, "Hayato Kuroda (Fujitsu)", John H, PostgreSQL-development

Hi,

On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada wrote:
>
> > I took a look at the v10 patch and it LGTM. I tested it - make
> > check-world passes, pgindent doesn't complain.
>
> While reviewing the patch, I found that with this patch, backend
> processes and autovacuum workers can simultaneously attempt to
> invalidate the same slot for the same reason. When invalidating a
> slot, we send a signal to the process owning the slot and wait for it
> to exit and release the slot. If the process takes a long time to exit
> for some reason, subsequent autovacuum workers attempting to
> invalidate the same slot will also send a SIGTERM and get stuck at
> InvalidatePossiblyObsoleteSlot(). In the worst case, this could result
> in all autovacuum activity being blocked. I think we need to address
> this problem.

Thank you! You're right that multiple autovacuum workers can wait on
the same slot for SIGTERM to take effect on the process (mainly
walsenders) holding the slot. Once the process holding the slot exits,
one worker finishes the invalidation and the others see it's done and
move on.

However, IMHO, this is unlikely to be a problem in practice. First,
SIGTERM must take a long time to terminate the process holding the
slot. This seems unlikely unless I'm missing some cases. Second, the
slot's xmin must be very old (past XID age) while the process is still
running but slow to exit. If we set max_slot_xid_age close to
vacuum_failsafe_age (e.g., 1.6 billion. I've added this note in the
docs), it seems unlikely that the replication connection would still
be active at that point.

Also, concurrent invalidation can already happen today between the
startup process and checkpointer on standby.

If needed, we could add a flag to skip extra invalidation attempts
based on field experience. Since this feature is off by default, I'd
prefer to keep things simple, but I'm open to other approaches.

Thoughts?

-- 
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com