From: shveta malik
Date: Tue, 31 Mar 2026 16:12:00 +0530
Subject: Re: Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion?
To: Nisha Moond
Cc: Fujii Masao, Amit Kapila, PostgreSQL Hackers, shveta malik

On Tue, Mar 31, 2026 at 11:35 AM Nisha Moond wrote:
>
> On Mon, Mar 30, 2026 at 4:39 PM Fujii Masao wrote:
> >
> > On Mon, Mar 30, 2026 at 1:18 PM Nisha Moond wrote:
> > > We were using the same log message in two places:
> > > check_and_set_sync_info() and HandleSlotSyncMessage().
> > > I think “will not start” fits better in the first case, while “will
> > > stop” makes sense to keep in the second.
> >
> > Thanks for updating the patch!
> >
> > With the patch, in my testing, standby promotion always produces
> > the following logs:
> >
> >     LOG:  replication slot synchronization worker will stop because
> > promotion is triggered
> >     LOG:  replication slot synchronization worker will not start
> > because promotion was triggered
> >
> > It looks like the postmaster immediately restarts the slotsync worker after
> > promotion terminates it, and that new worker then exits on seeing
> > SlotSyncCtx->stopSignaled.
> >
> > IMO, always emitting both messages is a bit confusing. It would be nice to
> > suppress the second one if possible.
> >
> > One idea would be to prevent the restart altogether. For example,
> > ProcessSlotSyncMessage() could set SlotSyncCtx->last_start_time to
> > a special value (like -1), and SlotSyncWorkerCanRestart() could return
> > false (i.e., prevent postmaster from starting up slotsync worker) when
> > it sees that. Alternatively, SlotSyncWorkerCanRestart() could simply
> > check SlotSyncCtx->stopSignaled.
> >
> > That said, as far as I remember correctly, postmaster is generally not
> > supposed to touch shared memory (per the comments in postmaster.c),
> > so I'm not sure this approach is acceptable. On the other hand,
> > postmaster and the slotsync worker already rely on SlotSyncCtx->last_start_time,
> > so perhaps there's some precedent here.
> >
>
> IIUC, checking SlotSyncCtx->stopSignaled in SlotSyncWorkerCanRestart()
> may not be ideal, as it requires a spinlock to avoid races with the
> startup process and it is disallowed to take lock in postmaster main
> loop. Whereas, SlotSyncCtx->last_start_time doesn’t need a lock since
> the postmaster accesses it only when the worker is not alive.
>

I agree.

> Another option could be to log in check_and_set_sync_info() at DEBUG1
> instead of LOG level. This message appears only after stopSignaled is
> set, when promotion is already in progress and the first worker has
> logged “will stop…”.
> The second worker doesn’t do any real work. Since
> there’s nothing actionable for users, using DEBUG1 would keep it
> useful for debugging (e.g., noticing immediate restarts) while
> avoiding extra log noise. Thoughts?
>

+1.

Do you think we can slightly tweak the comment atop the file to:

On promotion, the startup process sets 'stopSignaled' and uses this
'pid' to signal the synchronizing process with PROCSIG_SLOTSYNC_MESSAGE
and also to wake it up so that the process can immediately stop its
synchronizing work. Setting 'stopSignaled' on the other hand is used
to handle the race condition....

Also, shall we exit ProcessSlotSyncMessage() early if syncing has
already been finished by the API?

thanks
Shveta
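
For reference, here is a rough C sketch of the "special value" idea quoted
above, where the restart check refuses to restart the worker once
last_start_time holds a sentinel. It is a toy model only, not the actual
slotsync.c code: ToySlotSyncCtx, SLOTSYNC_NO_RESTART, RESTART_INTERVAL_SEC
and the toy_* functions are all hypothetical names, and the 10-second
interval is an assumed value chosen for illustration.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical sentinel meaning "promotion requested, never restart". */
#define SLOTSYNC_NO_RESTART ((time_t) -1)

/* Assumed minimum gap between worker starts, in seconds. */
#define RESTART_INTERVAL_SEC 10

/* Toy stand-in for the shared context; only the fields used here. */
typedef struct ToySlotSyncCtx
{
	bool		stopSignaled;
	time_t		last_start_time;
} ToySlotSyncCtx;

/* What the worker's message handler might do when asked to stop. */
void
toy_mark_no_restart(ToySlotSyncCtx *ctx)
{
	ctx->stopSignaled = true;
	ctx->last_start_time = SLOTSYNC_NO_RESTART;
}

/*
 * Postmaster-side check. It reads only last_start_time and takes no lock,
 * relying on being called only while the worker is not alive.
 */
bool
toy_worker_can_restart(ToySlotSyncCtx *ctx, time_t now)
{
	/* Sentinel seen: promotion is in progress, do not restart. */
	if (ctx->last_start_time == SLOTSYNC_NO_RESTART)
		return false;

	/* Too soon since the last start. */
	if (now - ctx->last_start_time < RESTART_INTERVAL_SEC)
		return false;

	ctx->last_start_time = now;
	return true;
}
```

With something along these lines, the second worker would never be
launched, so the "will not start" message would not appear at all. Whether
the postmaster may rely on the shared field this way is exactly the open
question raised upthread.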