From: Fujii Masao
Date: Tue, 24 Mar 2026 18:15:41 +0900
Subject: Re: Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion?
To: Nisha Moond
Cc: Amit Kapila, PostgreSQL Hackers

On Tue, Mar 24, 2026 at 3:00 PM Fujii Masao wrote:
>
> On Tue, Mar 24, 2026 at 1:01 PM Nisha Moond wrote:
> > Hi Fujii-san,
> >
> > I tried reproducing the wait scenario as you mentioned, but could not
> > reproduce it.
> > Steps I followed:
> > 1) Place a debugger in the slotsync worker and hold it at
> > fetch_remote_slots() ... -> libpqsrv_get_result()
> > 2) Kill the primary.
> > 3) Triggered promotion of the standby and release debugger from slotsync worker.
> >
> > The slot sync worker stops when the promotion is triggered and then
> > restarts, but fails to connect to the primary. The promotion happens
> > immediately.
> > ```
> > LOG: received promote request
> > LOG: redo done at 0/0301AD40 system usage: CPU: user: 0.00 s, system:
> > 0.02 s, elapsed: 4574.89 s
> > LOG: last completed transaction was at log time 2026-03-23
> > 17:13:15.782313+05:30
> > LOG: replication slot synchronization worker will stop because
> > promotion is triggered
> > LOG: slot sync worker started
> > ERROR: synchronization worker "slotsync worker" could not connect to
> > the primary server: connection to server at "127.0.0.1", port 9933
> > failed: Connection refused
> > Is the server running on that host and accepting TCP/IP connections?
> > ```
> >
> > I’ll debug this further to understand it better.
> > In the meantime, please let me know if I’m missing any step, or if you
> > followed a specific setup/script to reproduce this scenario.
>
> Thanks for testing!
>
> If you killed the primary with a signal like SIGTERM, an RST packet might have
> been sent to the slotsync worker at that moment. That allowed the worker to
> detect the connection loss and exit the wait state, so promotion could
> complete as expected.
>
> To reproduce the issue, you'll need a scenario where the worker cannot detect
> the connection loss. For example, you could block network traffic (e.g., with
> iptables) between the primary and the slotsync worker. The key is to create
> a situation where the worker remains stuck waiting for input for a long time.

Here's one way to reproduce the issue using iptables:

----------------------------------------------------
[Set up slot synchronization environment]

initdb -D data --encoding=UTF8 --locale=C
cat <<EOF >> data/postgresql.conf
wal_level = logical
synchronized_standby_slots = 'physical_slot'
EOF
pg_ctl -D data start
pg_receivewal --create-slot -S physical_slot
pg_recvlogical --create-slot -S logical_slot -P pgoutput --enable-failover -d postgres
psql -c "CREATE PUBLICATION mypub"

pg_basebackup -D sby1 -c fast -R -S physical_slot -d "dbname=postgres" -h 127.0.0.1
cat <<EOF >> sby1/postgresql.conf
port = 5433
sync_replication_slots = on
hot_standby_feedback = on
EOF
pg_ctl -D sby1 start
psql -c "SELECT pg_logical_emit_message(true, 'abc', 'xyz')"

[Block network traffic used by slot synchronization]

su -
iptables -A INPUT -p tcp --sport 5432 -j DROP
iptables -A OUTPUT -p tcp --dport 5432 -j DROP

[Promote the standby]

# wait a few seconds
pg_ctl -D sby1 promote
----------------------------------------------------

In my tests on master, promotion got stuck in this scenario. With the patch,
promotion completed promptly.

After testing, you can remove the network block with:

iptables -D INPUT -p tcp --sport 5432 -j DROP
iptables -D OUTPUT -p tcp --dport 5432 -j DROP

Regards,

-- 
Fujii Masao
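
For repeated runs, the block and unblock rules above could be generated by a
small helper so that the add and delete forms stay in sync. This is only a
sketch: the function name slotsync_block_rules is hypothetical (it is not part
of PostgreSQL or iptables), and it only prints the commands rather than running
them, on the assumption that you want to review them before feeding them to a
root shell.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not an existing tool):
# print the iptables commands that add (A) or delete (D) the DROP rules
# for TCP traffic from/to the given primary port. Nothing is executed.
slotsync_block_rules() {
    action=$1
    port=${2:-5432}    # default matches the primary's port in the recipe

    case "$action" in
        A|D) ;;
        *)
            echo "usage: slotsync_block_rules A|D [port]" >&2
            return 1
            ;;
    esac

    # Drop packets arriving from the primary and packets sent to it.
    printf 'iptables -%s INPUT -p tcp --sport %s -j DROP\n' "$action" "$port"
    printf 'iptables -%s OUTPUT -p tcp --dport %s -j DROP\n' "$action" "$port"
}
```

As root, "slotsync_block_rules A 5432 | sh" would apply the block and
"slotsync_block_rules D 5432 | sh" would remove it again.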