public inbox for [email protected]help / color / mirror / Atom feed
[PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits 7+ messages / 4 participants [nested] [flat]
* [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits @ 2026-04-19 05:46 JoongHyuk Shin <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: JoongHyuk Shin @ 2026-04-19 05:46 UTC (permalink / raw) To: [email protected] <[email protected]> In ResolveRecoveryConflictWithBufferPin(), when deadlock_timeout fires, the function sends RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK and returns. The caller (LockBufferForCleanup) loops back, sets up another deadlock_timeout, and the signal gets sent again every interval. The lock-conflict path had the same problem and was fixed in 8900b5a9d59a by adding a second ProcWaitForSignal() after the deadlock-check signal. The buffer-pin path was left with an XXX comment asking "should we fix this?". The attached patch applies the same fix: after sending the deadlock-check signal, reset got_standby_deadlock_timeout and call ProcWaitForSignal() so the startup process waits for UnpinBuffer() rather than looping and re-signaling. Patch attached. Attachments: [application/octet-stream] 0001-Prevent-repeated-deadlock-check-signals-in-standby-b.patch (2.4K, 3-0001-Prevent-repeated-deadlock-check-signals-in-standby-b.patch) download | inline diff: From 58239700edf0c669f4807da6140a595c2a1e8a5e Mon Sep 17 00:00:00 2001 From: JoongHyuk Shin <[email protected]> Date: Fri, 17 Apr 2026 16:58:31 +0900 Subject: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits After sending RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK, the startup process returned without waiting, so the caller's loop would fire another deadlock_timeout and re-send the signal every interval. This added unnecessary overhead in both the startup process and backends. Fix by adding a ProcWaitForSignal() call after the deadlock-check signal, mirroring the approach already used in the lock-conflict path (commit 8900b5a9d59a). This ensures the signal is sent at most once per deadlock_timeout period rather than repeatedly. Also remove the XXX comment that noted this problem. --- src/backend/storage/ipc/standby.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index 29af7733948..9744db5715c 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -851,17 +851,19 @@ ResolveRecoveryConflictWithBufferPin(void) /* * Send out a request for hot-standby backends to check themselves for * deadlocks. - * - * XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait - * to be signaled by UnpinBuffer() again and send a request for - * deadlocks check if deadlock_timeout happens. This causes the - * request to continue to be sent every deadlock_timeout until the - * buffer is unpinned or ltime is reached. This would increase the - * workload in the startup process and backends. In practice it may - * not be so harmful because the period that the buffer is kept pinned - * is basically no so long. But we should fix this? */ SendRecoveryConflictWithBufferPin(RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK); + + /* + * Wait here to be signaled by UnpinBuffer(), to prevent the + * subsequent ResolveRecoveryConflictWithBufferPin() call (from the + * caller's loop) from firing another deadlock_timeout and re-sending + * the deadlock-check signal. Without this, the signal would be sent + * every deadlock_timeout interval until the buffer is unpinned or + * ltime is reached. + */ + got_standby_deadlock_timeout = false; + ProcWaitForSignal(WAIT_EVENT_BUFFER_CLEANUP); } /* -- 2.52.0 ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits @ 2026-04-21 05:42 Fujii Masao <[email protected]> parent: JoongHyuk Shin <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: Fujii Masao @ 2026-04-21 05:42 UTC (permalink / raw) To: JoongHyuk Shin <[email protected]>; +Cc: [email protected] <[email protected]> On Sun, Apr 19, 2026 at 2:47 PM JoongHyuk Shin <[email protected]> wrote: > > In ResolveRecoveryConflictWithBufferPin(), when deadlock_timeout fires, > the function sends RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK and returns. > The caller (LockBufferForCleanup) loops back, sets up another deadlock_timeout, > and the signal gets sent again every interval. > > The lock-conflict path had the same problem and was fixed in 8900b5a9d59a > by adding a second ProcWaitForSignal() after the deadlock-check signal. > The buffer-pin path was left with an XXX comment asking "should we fix this?". > > The attached patch applies the same fix: after sending the deadlock-check > signal, reset got_standby_deadlock_timeout and call ProcWaitForSignal() > so the startup process waits for UnpinBuffer() rather than looping > and re-signaling. > > Patch attached. Thanks for the patch! LGTM. Since this change improves recovery-conflict behavior rather than fixing a bug, it doesn't seem to need backpatching and we may need to wait until v20 development opens (probably July) before committing it. While reading the patch and ResolveRecoveryConflictWithBufferPin(), I also noticed that got_standby_delay_timeout is not initialized to false before enabling the timeout. This is unrelated to the patch, and I think it is harmless in the current code, but would it be better to initialize it there, as we already do for got_standby_deadlock_timeout? if (ltime != 0) { + got_standby_delay_timeout = false; timeouts[cnt].id = STANDBY_TIMEOUT; timeouts[cnt].type = TMPARAM_AT; timeouts[cnt].fin_time = ltime; Regards, -- Fujii Masao ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits @ 2026-04-21 05:55 Michael Paquier <[email protected]> parent: Fujii Masao <[email protected]> 0 siblings, 2 replies; 7+ messages in thread From: Michael Paquier @ 2026-04-21 05:55 UTC (permalink / raw) To: Fujii Masao <[email protected]>; +Cc: JoongHyuk Shin <[email protected]>; [email protected] <[email protected]> On Tue, Apr 21, 2026 at 02:42:38PM +0900, Fujii Masao wrote: > Since this change improves recovery-conflict behavior rather than fixing a bug, > it doesn't seem to need backpatching and we may need to wait until v20 > development opens (probably July) before committing it. Yeah, this one is an improvement, not an actual bug, so let's wait for v20 if worth doing (I did not check it). -- Michael Attachments: [application/pgp-signature] signature.asc (833B, 2-signature.asc) download ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits @ 2026-04-27 11:07 JoongHyuk Shin <[email protected]> parent: Michael Paquier <[email protected]> 1 sibling, 0 replies; 7+ messages in thread From: JoongHyuk Shin @ 2026-04-27 11:07 UTC (permalink / raw) To: Michael Paquier <[email protected]>; +Cc: Fujii Masao <[email protected]>; [email protected] <[email protected]> Thanks for the review. v2 attached, with the suggested initialization added for symmetry. Agreed this is an improvement rather than a bug fix, so I've updated the CF tag to Performance accordingly. I also verified the fix locally on a primary-standby setup, using the buffer-pin conflict scenario from src/test/recovery/t/ 031_recovery_conflict.pl (aborted INSERT + cursor on standby + VACUUM FREEZE on primary). On master, strace showed 9 SIGUSR1 broadcasts to the conflicting backend over a 10-second window (one per deadlock_timeout). With the patch applied, only 1 broadcast over the same window. Patch attached. -- JoongHyuk Shin On Tue, Apr 21, 2026 at 2:55 PM Michael Paquier <[email protected]> wrote: > On Tue, Apr 21, 2026 at 02:42:38PM +0900, Fujii Masao wrote: > > Since this change improves recovery-conflict behavior rather than fixing > a bug, > > it doesn't seem to need backpatching and we may need to wait until v20 > > development opens (probably July) before committing it. > > Yeah, this one is an improvement, not an actual bug, so let's wait for > v20 if worth doing (I did not check it). > -- > Michael > Attachments: [application/octet-stream] v2-0001-Prevent-repeated-deadlock-check-signals-in-standb.patch (2.8K, 3-v2-0001-Prevent-repeated-deadlock-check-signals-in-standb.patch) download | inline diff: From e70a2fc74a63d4c1e3d1277e27a37b1e26710fff Mon Sep 17 00:00:00 2001 From: JoongHyuk Shin <[email protected]> Date: Fri, 17 Apr 2026 16:58:31 +0900 Subject: [PATCH v2] Prevent repeated deadlock-check signals in standby buffer pin waits After sending RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK, the startup process returned without waiting, so the caller's loop would fire another deadlock_timeout and re-send the signal every interval. This added unnecessary overhead in both the startup process and backends. Fix by adding a ProcWaitForSignal() call after the deadlock-check signal, mirroring the approach already used in the lock-conflict path (commit 8900b5a9d59a). This ensures the signal is sent at most once per deadlock_timeout period rather than repeatedly. Additionally, reset got_standby_delay_timeout to false before enabling STANDBY_TIMEOUT, for symmetry with the existing got_standby_deadlock_timeout reset. Remove the XXX comment that noted this problem. --- src/backend/storage/ipc/standby.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index 29af7733948..0dba1fb4289 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -818,6 +818,7 @@ ResolveRecoveryConflictWithBufferPin(void) if (ltime != 0) { + got_standby_delay_timeout = false; timeouts[cnt].id = STANDBY_TIMEOUT; timeouts[cnt].type = TMPARAM_AT; timeouts[cnt].fin_time = ltime; @@ -851,17 +852,19 @@ ResolveRecoveryConflictWithBufferPin(void) /* * Send out a request for hot-standby backends to check themselves for * deadlocks. - * - * XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait - * to be signaled by UnpinBuffer() again and send a request for - * deadlocks check if deadlock_timeout happens. This causes the - * request to continue to be sent every deadlock_timeout until the - * buffer is unpinned or ltime is reached. This would increase the - * workload in the startup process and backends. In practice it may - * not be so harmful because the period that the buffer is kept pinned - * is basically no so long. But we should fix this? */ SendRecoveryConflictWithBufferPin(RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK); + + /* + * Wait here to be signaled by UnpinBuffer(), to prevent the + * subsequent ResolveRecoveryConflictWithBufferPin() call (from the + * caller's loop) from firing another deadlock_timeout and re-sending + * the deadlock-check signal. Without this, the signal would be sent + * every deadlock_timeout interval until the buffer is unpinned or + * ltime is reached. + */ + got_standby_deadlock_timeout = false; + ProcWaitForSignal(WAIT_EVENT_BUFFER_CLEANUP); } /* -- 2.52.0 ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits @ 2026-05-18 20:32 Álvaro Herrera <[email protected]> parent: Michael Paquier <[email protected]> 1 sibling, 1 reply; 7+ messages in thread From: Álvaro Herrera @ 2026-05-18 20:32 UTC (permalink / raw) To: Michael Paquier <[email protected]>; +Cc: Fujii Masao <[email protected]>; JoongHyuk Shin <[email protected]>; [email protected] <[email protected]>; Vitaly Davydov <[email protected]> Hello, On 2026-Apr-21, Michael Paquier wrote: > On Tue, Apr 21, 2026 at 02:42:38PM +0900, Fujii Masao wrote: > > Since this change improves recovery-conflict behavior rather than fixing a bug, > > it doesn't seem to need backpatching and we may need to wait until v20 > > development opens (probably July) before committing it. > > Yeah, this one is an improvement, not an actual bug, so let's wait for > v20 if worth doing (I did not check it). Hmm, is this related to https://postgr.es/m/[email protected] ? In there, Vitaly claims to be reporting a bug that goes back to pg15, which contradicts this assessment. Regards, -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "¿Qué importan los años? Lo que realmente importa es comprobar que a fin de cuentas la mejor edad de la vida es estar vivo" (Mafalda) ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits @ 2026-05-22 08:41 JoongHyuk Shin <[email protected]> parent: Álvaro Herrera <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: JoongHyuk Shin @ 2026-05-22 08:41 UTC (permalink / raw) To: Álvaro Herrera <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected] <[email protected]>; Vitaly Davydov <[email protected]> Hello. Same function, different races, I think. Vitaly reports a missed wake-up where deadlock_timeout never fires (spurious SIGALRM from log_startup_progress_interval plus the lazy setitimer in 09cf1d52). This patch addresses the opposite, deadlock_timeout does fire, but LockBufferForCleanup loops back and re-arms it, so the signal repeats once per second. The two fixes do not overlap (the added ProcWaitForSignal sits inside the deadlock branch that Vitaly's scenario never reaches). -- JH Shin On Wed, May 20, 2026 at 7:15 AM Álvaro Herrera <[email protected]> wrote: > Hello, > > On 2026-Apr-21, Michael Paquier wrote: > > > On Tue, Apr 21, 2026 at 02:42:38PM +0900, Fujii Masao wrote: > > > Since this change improves recovery-conflict behavior rather than > fixing a bug, > > > it doesn't seem to need backpatching and we may need to wait until v20 > > > development opens (probably July) before committing it. > > > > Yeah, this one is an improvement, not an actual bug, so let's wait for > > v20 if worth doing (I did not check it). > > Hmm, is this related to > https://postgr.es/m/[email protected] ? > In there, Vitaly claims to be reporting a bug that goes back to pg15, > which contradicts this assessment. > > Regards, > > -- > Álvaro Herrera Breisgau, Deutschland — > https://www.EnterpriseDB.com/ > "¿Qué importan los años? Lo que realmente importa es comprobar que > a fin de cuentas la mejor edad de la vida es estar vivo" (Mafalda) > ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits @ 2026-05-29 06:30 Michael Paquier <[email protected]> parent: JoongHyuk Shin <[email protected]> 0 siblings, 0 replies; 7+ messages in thread From: Michael Paquier @ 2026-05-29 06:30 UTC (permalink / raw) To: JoongHyuk Shin <[email protected]>; +Cc: Álvaro Herrera <[email protected]>; Fujii Masao <[email protected]>; [email protected] <[email protected]>; Vitaly Davydov <[email protected]> On Fri, May 22, 2026 at 05:41:03PM +0900, JoongHyuk Shin wrote: > This patch addresses the opposite, > deadlock_timeout does fire, but LockBufferForCleanup loops back and re-arms > it, so the signal repeats once per second. Right. I don't really see why this should be backpatched. One argument would be more consistency of this area of the code across all the stable branches, but the argument is kind of moot as this does not fix a problem, just improves a bit what we have. -- Michael Attachments: [application/pgp-signature] signature.asc (833B, 2-signature.asc) download ^ permalink raw reply [nested|flat] 7+ messages in thread
end of thread, other threads:[~2026-05-29 06:30 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-04-19 05:46 [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits JoongHyuk Shin <[email protected]> 2026-04-21 05:42 ` Fujii Masao <[email protected]> 2026-04-21 05:55 ` Michael Paquier <[email protected]> 2026-04-27 11:07 ` JoongHyuk Shin <[email protected]> 2026-05-18 20:32 ` Álvaro Herrera <[email protected]> 2026-05-22 08:41 ` JoongHyuk Shin <[email protected]> 2026-05-29 06:30 ` Michael Paquier <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox