Date: Thu, 19 Mar 2026 09:54:04 +0900
From: Michael Paquier
To: Tom Lane
Cc: Alexander Lakhin, "Iwata, Aya/岩田 彩", Peter Smith, "Kuroda, Hayato/黒田 隼人", Pavel Stehule, Chao Li, pgsql-hackers
Subject: Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE
In-Reply-To: <1020519.1773863522@sss.pgh.pa.us>

On Wed, Mar 18, 2026 at 03:52:02PM -0400, Tom Lane wrote:
> which makes me wonder whether the problematic session is the second or
> third bgworker. I am not seeing entries indicating that those
> stopped, as there is for the first bgworker.

Looking at the logs produced at [1], the worker launched as number 1 cannot be the one interacting here: it connects to the database postgres, under PID 1616001, and is reported as exited by the postmaster. The only interacting sessions would be:
1) The bgworker launched as number 2, connected to database testdb.
2) The session checking pg_stat_activity, launched by launch_bgworker(). The test was connected to the database we want to rename, so this could interact as an extra session. This query could be run while connected to the database postgres to reduce the friction and rule this one out.

The timestamps of the logs tell that it takes 5 seconds for this host to get out of the ALTER DATABASE ..
RENAME TO, which implies that we are looping inside CountOtherDBBackends() for 5 seconds. So it really looks like the second bgworker is the one we are waiting for here.

Now, we are sure of the following things when we try to launch the RENAME TO:
- The worker is seen in pg_stat_activity.
- The worker is already in worker_spi_main(), per its "LOG initialized with" entry.
- The worker is connected to the database.
- The worker can receive signals.

How would it be possible for this worker to not receive the requests? The only thing I could think of is that the postmaster does not have the time to process the PMSIGNAL_BACKGROUND_WORKER_CHANGE requests.

The next thing would be to gather more data, I guess. The attached would help in providing more information. If it happens that we are able to send the requests and that the postmaster does not have the time to process them, I don't really see what we can do except:
- Drop the portion of the tests for DROP DATABASE, SET TABLESPACE and RENAME DB, because all these scenarios involve commands that work on the same database the worker is connected to, and they cannot be stable if the postmaster does not have the time to process the termination requests. This could also point to a timing issue with the feature itself, of course.
- Revert the feature, stop playing with the buildfarm due to the end of the release cycle, and rework it for v20.

For now I am planning to use the attached to get more information from widowbird, which should take a few days at worst. That would make clear if we have a timing issue with the requests sent to the postmaster. Launching the queries for worker_spi_launch() and pg_stat_activity on the database postgres may also improve things, but I don't really buy it, even if I may be wrong.
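As a rough illustration of why a 5-second stall points at CountOtherDBBackends(), here is a minimal standalone sketch of a bounded retry loop. It assumes a budget of 50 attempts with a 100 ms pause between them, which is a reading of procarray.c and not something stated in this thread. The names wait_for_backends(), worker_never_exits() and worker_exits_on_third_try() are hypothetical stand-ins, and the sleep is only accounted for, not performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRIES 50	/* assumed retry count in CountOtherDBBackends() */
#define SLEEP_MS  100	/* assumed pause between retries, in milliseconds */

/* Hypothetical check: is another session still attached to the database? */
typedef bool (*conflict_check) (int attempt);

/* A worker that never processes its termination request. */
static bool
worker_never_exits(int attempt)
{
	(void) attempt;
	return true;
}

/* A worker that is gone by the third check (attempts 0 and 1 still busy). */
static bool
worker_exits_on_third_try(int attempt)
{
	return attempt < 2;
}

/*
 * Returns true if conflicting sessions remain once the budget is exhausted,
 * false as soon as none are found.  *waited_ms receives the simulated time
 * spent waiting; a real implementation would sleep instead of counting.
 */
static bool
wait_for_backends(conflict_check still_busy, int *waited_ms)
{
	assert(still_busy != NULL && waited_ms != NULL);

	*waited_ms = 0;
	for (int tries = 0; tries < MAX_TRIES; tries++)
	{
		if (!still_busy(tries))
			return false;		/* nobody left, the command can proceed */
		*waited_ms += SLEEP_MS;	/* stand-in for the sleep between retries */
	}
	return true;				/* budget exhausted, the command fails */
}
```

Under these assumptions, a worker that never exits exhausts the whole budget (50 x 100 ms = 5000 ms), which would be consistent with the 5 seconds seen in the logs, while a worker that exits promptly lets the loop return after a few hundred milliseconds at most.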
[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=widowbird&dt=2026-03-17%2015%3A35%3A03
--
Michael

Attachment: 0001-Add-more-debugging-information-for-termination-tests.patch

From adb474866f12a66ac48c702d9ab25ce82277abca Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Thu, 19 Mar 2026 09:51:16 +0900
Subject: [PATCH] Add more debugging information for termination tests of worker_spi

---
 src/backend/postmaster/bgworker.c                     |  6 ++++++
 src/test/modules/worker_spi/t/002_worker_terminate.pl | 10 ++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 0104a86b9ecd..fd678ef2596d 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1413,6 +1413,9 @@ TerminateBackgroundWorkersForDatabase(Oid databaseId)
 {
 	bool		signal_postmaster = false;
 
+	elog(DEBUG1, "attempting worker termination for database %u",
+		 databaseId);
+
 	LWLockAcquire(BackgroundWorkerLock, LW_EXCLUSIVE);
 
 	/*
@@ -1432,6 +1435,9 @@ TerminateBackgroundWorkersForDatabase(Oid databaseId)
 			{
 				slot->terminate = true;
 				signal_postmaster = true;
+
+				elog(DEBUG1, "termination requested for worker (PID %d) on database %u",
+					 slot->pid, databaseId);
 			}
 		}
 	}
diff --git a/src/test/modules/worker_spi/t/002_worker_terminate.pl b/src/test/modules/worker_spi/t/002_worker_terminate.pl
index 6db80ffec88c..b0e6a5376d4c 100644
--- a/src/test/modules/worker_spi/t/002_worker_terminate.pl
+++ b/src/test/modules/worker_spi/t/002_worker_terminate.pl
@@ -24,7 +24,7 @@ sub launch_bgworker
 
 	# Launch a background worker on the given database.
 	my $pid = $node->safe_psql(
-		$database, qq(
+		'postgres', qq(
         SELECT worker_spi_launch($testcase, '$database'::regdatabase, 0, '{}', $interruptible);
     ));
 
@@ -32,7 +32,7 @@ sub launch_bgworker
 	$node->wait_for_log(
 		qr/LOG: .*worker_spi dynamic worker $testcase initialized with .*\..*/,
 		$offset);
-	my $result = $node->safe_psql($database,
+	my $result = $node->safe_psql('postgres',
 		"SELECT count(*) > 0 FROM pg_stat_activity WHERE pid = $pid;");
 	is($result, 't', "dynamic bgworker $testcase launched");
 
@@ -52,6 +52,11 @@ sub run_bgworker_interruptible_test
 		qr/terminating background worker \"worker_spi dynamic\" due to administrator command/,
 		$offset);
 
+	# Postmaster entry reporting the worker as exiting.
+	$node->wait_for_log(
+		qr/LOG: .*background worker \"worker_spi dynamic\" \(PID $pid\) exited with exit code/,
+		$offset);
+
 	my $result = $node->safe_psql('postgres',
 		"SELECT count(*) = 0 FROM pg_stat_activity WHERE pid = $pid;");
 	is($result, 't', "dynamic bgworker stopped for $testname");
@@ -63,6 +68,7 @@ $node->append_conf(
 	"postgresql.conf", qq(
 autovacuum = off
 debug_parallel_query = off
+log_min_messages = debug1
 ));
 $node->start;
 
-- 
2.53.0