Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Alexander Lakhin <[email protected]>
To: Michael Paquier <[email protected]>
To: Tom Lane <[email protected]>
Cc: Iwata, Aya/岩田 彩 <[email protected]>
Cc: Peter Smith <[email protected]>
Cc: Kuroda, Hayato/黒田 隼人 <[email protected]>
Cc: Pavel Stehule <[email protected]>
Cc: Chao Li <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE
Date: Tue, 31 Mar 2026 10:00:00 +0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<OS3PR01MB8889505E2F3E443CCA4BD72EEA45A@OS3PR01MB8889.jpnprd01.prod.outlook.com>
	<[email protected]>
	<[email protected]>
	<TYRPR01MB12156CA129F54361D19ECAB56F541A@TYRPR01MB12156.jpnprd01.prod.outlook.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

Hello Michael,

21.03.2026 04:46, Michael Paquier wrote:
> So we are able to send the requests to the workers, and these can take
> a long time before being processed by the postmaster.  Querying
> directly "postgres" for the worker_spi_launch() and pg_stat_activity
> queries seems to have reduced the friction, with less requests to
> send.   However, I don't think that this is the end of the story, even
> after 79a5911fe65b I have spotted one case of RENAME TO where the
> requests were sent for a bit more than 4s, before the postmaster had
> the idea to catch up.  RENAME TO is the only one that can get slow
> (really no idea why), so I guess that we could always tweak things a
> bit more:
> 1) Extra injection point to increase the timeout (30s or 60s?) and
> give the postmaster more room to proceed the requests.
> 2) Remove this portion of the test, but it would be sad.
>
> I'll keep an eye for more failures, even if the situation is looking
> slightly better.

Having reproduced this locally (running 3 tests in parallel with
ALTER DATABASE RENAME repeated 200 times, on a slow riscv64 machine), I
discovered that in the bad case the worker doesn't reach the main loop in
time (and CHECK_FOR_INTERRUPTS() inside it), because it doesn't get out of
initialize_worker_spi() -> CommitTransactionCommand().

With this modification:
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3752,3 +3752,3 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
          */
-       int                     ntries = 50;
+       int                     ntries = 500;

@@ -3798,3 +3798,6 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
                 if (!found)
+{
+elog(LOG, "!!!CountOtherDBBackends| found no backends, try %d", tries);
                         return false;           /* no conflicting backends, so done */
+}

I can see the following:
... !!!CountOtherDBBackends| found no backends, try 1
# most of the calls (200 of 201) succeeded with try 1, but there are also:
... !!!CountOtherDBBackends| found no backends, try 7
... !!!CountOtherDBBackends| found no backends, try 51
... !!!CountOtherDBBackends| found no backends, try 74
... !!!CountOtherDBBackends| found no backends, try 84

So the backend is not completely stuck, but CommitTransactionCommand()
may take more than 5 seconds under some circumstances (maybe it's worth
investigating which exactly).

Best regards,
Alexander

view thread (70+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox