From: Fujii Masao
Date: Tue, 24 Mar 2026 15:00:20 +0900
Subject: Re: Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion?
To: Nisha Moond
Cc: Amit Kapila, PostgreSQL Hackers

On Tue, Mar 24, 2026 at 1:01 PM Nisha Moond wrote:
> Hi Fujii-san,
>
> I tried reproducing the wait scenario as you mentioned, but could not
> reproduce it.
> Steps I followed:
> 1) Place a debugger in the slotsync worker and hold it at
> fetch_remote_slots() ... -> libpqsrv_get_result()
> 2) Kill the primary.
> 3) Triggered promotion of the standby and release debugger from slotsync worker.
>
> The slot sync worker stops when the promotion is triggered and then
> restarts, but fails to connect to the primary. The promotion happens
> immediately.
> ```
> LOG: received promote request
> LOG: redo done at 0/0301AD40 system usage: CPU: user: 0.00 s, system:
> 0.02 s, elapsed: 4574.89 s
> LOG: last completed transaction was at log time 2026-03-23
> 17:13:15.782313+05:30
> LOG: replication slot synchronization worker will stop because
> promotion is triggered
> LOG: slot sync worker started
> ERROR: synchronization worker "slotsync worker" could not connect to
> the primary server: connection to server at "127.0.0.1", port 9933
> failed: Connection refused
> Is the server running on that host and accepting TCP/IP connections?
> ```
>
> I’ll debug this further to understand it better.
> In the meantime, please let me know if I’m missing any step, or if you
> followed a specific setup/script to reproduce this scenario.

Thanks for testing!

If you killed the primary with a signal like SIGTERM, an RST packet might have
been sent to the slotsync worker at that moment. That would have allowed the
worker to detect the connection loss and exit the wait state, so promotion
could complete as expected.

To reproduce the issue, you'll need a scenario where the worker cannot detect
the connection loss. For example, you could block network traffic (e.g., with
iptables) between the primary and the slotsync worker. The key is to create
a situation where the worker remains stuck waiting for input for a long time.

Regards,

-- 
Fujii Masao
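
For illustration only, here is a minimal sketch of the difference described
above: a reader blocked on a socket wakes up when its peer goes away cleanly
(FIN or RST arrives), but sees nothing at all when the peer simply goes
silent, which is what dropped packets look like from the reader's side.
This is plain Python on loopback, not the slotsync worker code (which is C
on top of libpq); wait_for_input() is a hypothetical helper, and the timeout
exists only so the demo terminates.

```python
import socket


def wait_for_input(peer_closes: bool, timeout: float = 1.0) -> str:
    """Report how a blocked reader observes its peer.

    Returns "eof" or "reset" if the connection loss is detected,
    "timeout" if the reader is still waiting when the demo timeout expires,
    and "data" if the peer unexpectedly sent something.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    reader = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()
    try:
        if peer_closes:
            # Peer terminates: the kernel notifies the reader, so recv() returns.
            peer.close()
        # Otherwise the peer stays silent. From the reader's point of view this
        # is indistinguishable from traffic being dropped on the network.
        reader.settimeout(timeout)  # demo-only; assumed absent in the stuck case
        try:
            data = reader.recv(1024)
            return "eof" if data == b"" else "data"
        except socket.timeout:
            return "timeout"
        except ConnectionResetError:
            return "reset"
    finally:
        reader.close()
        peer.close()
        listener.close()


if __name__ == "__main__":
    print("peer closed:", wait_for_input(peer_closes=True))
    print("peer silent:", wait_for_input(peer_closes=False, timeout=0.5))
```

With the peer closed, the reader should return promptly; with the peer
silent, it only returns because of the artificial timeout. Without that
timeout it would keep waiting, which is the situation to recreate for the
slotsync worker.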