Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7T57-005LYc-1J for pgsql-hackers@arkaria.postgresql.org; Tue, 31 Mar 2026 07:00:09 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w7T55-008GVX-1e for pgsql-hackers@arkaria.postgresql.org; Tue, 31 Mar 2026 07:00:07 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7T55-008GVP-0k for pgsql-hackers@lists.postgresql.org; Tue, 31 Mar 2026 07:00:07 +0000 Received: from mail-wr1-x42d.google.com ([2a00:1450:4864:20::42d]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w7T54-00000001vey-0BGD for pgsql-hackers@postgresql.org; Tue, 31 Mar 2026 07:00:06 +0000 Received: by mail-wr1-x42d.google.com with SMTP id ffacd0b85a97d-43cf5ad500fso2300355f8f.0 for ; Tue, 31 Mar 2026 00:00:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1774940403; x=1775545203; darn=postgresql.org; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:subject:from:user-agent:mime-version:date:message-id:from:to :cc:subject:date:message-id:reply-to; bh=Ky9qBwGC7Sz+4F68HJ3cm9SkJCanTq8rITwPnwHCXJc=; b=TH1oUwJZVf6zOAn1PpQBjsq/r3w1lqxjBE1xusGZ2UYo1wyV4TgURPpV+JCCxjlTko xd9M7lvTgToN2zyrWDCifshwSxp0x2j2l1zMuJ8OHUDScgLN1Uk7M3pMIsZBj14Y7/DG hKT8NurICUCIgVuFeKOMdMAuvd/xDtACjwxSbOIKSL9fzeP6MUeX8PHoZqPV8Xdz+XEN JxDHo51qdpqA1tsFS9mvg0ILk5QHoQQzpBl3dedOQmFxTzR4GBJxVEqpTswbLtiKLwNN b37+XkkNAdv2hFSEv9F3J1lQjOlhOY50OLlTCk1DTLQ10WRIzCT5xdjZjPubYh/H6GlL Hjyw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774940403; x=1775545203; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:subject:from:user-agent:mime-version:date:message-id:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Ky9qBwGC7Sz+4F68HJ3cm9SkJCanTq8rITwPnwHCXJc=; b=lrcLUj4K4HSzS64QHNg7I2TxeEdLCXUxqDgqKWB+CFIFJ8oKoTYpdRLrKcS7MFJY6h QOXrGgZERJmWBRuSjgJyKfsvcsi0deKSYX8ORLAGdNt3GH3QfXw3nl36XZYhu0Z2bQCU 5ZWh5BiHD4zdkAC83HGX4oJauIwj2wBjbQ0IPFIhfdzFPwHK3v/07JHfs0U/S0W3ZJUZ kesMCE2X/tBhsbFSNPc7+q6/fAbvgQOQ75EjH/7VcDPKJrpEVAO10cxc34+XG/UDvIq+ LCRRFxfiycX/pIOKRjeJay2ZIGUS2tTZ8ZqKgj6NfoKxbRikkqOKnnAZ/pyHLlu2sFuo Ttpw== X-Forwarded-Encrypted: i=1; AJvYcCUnXm5G1vFVr3nn9j+WhpmxBXkcJp/K8ux5d9OlVlBSQqTzM/qAc/C3bBP4Wa2o4ecxHumTb5pLfcerz1PX@postgresql.org X-Gm-Message-State: AOJu0YzmB0A7eW1PE9/jFwMaed6IG6iPlXAShV1Q5txELoiM/P/s6sb0 9HjpA+VTUmSgP8ElEeJkUdJoSD7NtCMaTVPjHQEBX9NLgJXWdiBspL/+ X-Gm-Gg: ATEYQzyOLvf77dQd7U84GcClcgB6P8VGM6dQlcUFGHabJ1fgTmSGx29fN4gTZCxPImT Tod8iLYAoBoJzfVBVCLb+5mZMiDRC8JD78wv9yWuP/wGcIBRtbAXmYuCOGdpr/53M7SRe5L7TBV 5U1v9E2+m9QCo4hvDIuxXdtIcJktBlj4XbWlrsGPQevzFp8pf+av8cBiixtuLHazJHxLJvsIVUa /M38IUM7+ymdg8lRvN59HfCWNBSWeCAhC2pDdtIPtDHtgCjWIv7NDlETI5PDv9mhGO10Ii92VZx cgyNVgiDB7zauT3WHuDmmeC6OIrqptaUw+XrpEzFiS0xPwRv1nWUoAX7I3shJnmw5rCpBA8AABr FS+hD3Mv7ZF3OVYn2xrQxiel5dKww+IZDuk3bcLCxYfkUdnJpqIwcv9DZ8MD0jpi7+4nUgBWveP VQjODUC/WvPA== X-Received: by 2002:a5d:5d10:0:b0:439:ae2a:755e with SMTP id ffacd0b85a97d-43b9e9a9785mr26897751f8f.23.1774940402939; Tue, 31 Mar 2026 00:00:02 -0700 (PDT) Received: from [192.168.0.50] ([89.149.68.143]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-43cf21f1749sm25073533f8f.14.2026.03.31.00.00.01 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 31 Mar 2026 00:00:02 -0700 (PDT) Message-ID: Date: Tue, 31 Mar 2026 10:00:00 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird From: Alexander Lakhin Subject: Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE To: Michael Paquier , Tom Lane Cc: =?UTF-8?B?SXdhdGEsIEF5YS/lsqnnlLAg5b2p?= , Peter Smith , =?UTF-8?B?S3Vyb2RhLCBIYXlhdG8v6buS55SwIOmavOS6ug==?= , Pavel Stehule , Chao Li , pgsql-hackers References: <1020519.1773863522@sss.pgh.pa.us> Content-Language: en-US In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hello Michael, 21.03.2026 04:46, Michael Paquier wrote: > So we are able to send the requests to the workers, and these can take > a long time before being processed by the postmaster. Querying > directly "postgres" for the worker_spi_launch() and pg_stat_activity > queries seems to have reduced the friction, with less requests to > send. However, I don't think that this is the end of the story, even > after 79a5911fe65b I have spotted one case of RENAME TO where the > requests were sent for a bit more than 4s, before the postmaster had > the idea to catch up. RENAME TO is the only one that can get slow > (really no idea why), so I guess that we could always tweak things a > bit more: > 1) Extra injection point to increase the timeout (30s or 60s?) and > give the postmaster more room to proceed the requests. > 2) Remove this portion of the test, but it would be sad. > > I'll keep an eye for more failures, even if the situation is looking > slightly better. Having reproduced this locally (running 3 tests in parallel with ALTER DATABASE RENAME repeated 200 times, on a slow riscv64 machine), I discovered that in the bad case the worker doesn't reach the main loop in time (and CHECK_FOR_INTERRUPTS() inside it), because it doesn't get out of initialize_worker_spi() -> CommitTransactionCommand(). With this modification: --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -3752,3 +3752,3 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)          */ -       int                     ntries = 50; +       int                     ntries = 500; @@ -3798,3 +3798,6 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)                 if (!found) +{ +elog(LOG, "!!!CountOtherDBBackends| found no backends, try %d", tries);                         return false;           /* no conflicting backends, so done */ +} I can see the following: ... !!!CountOtherDBBackends| found no backends, try 1 # most of the calls (200 of 201) succeeded with try 1, but there are also: ... !!!CountOtherDBBackends| found no backends, try 7 ... !!!CountOtherDBBackends| found no backends, try 51 ... !!!CountOtherDBBackends| found no backends, try 74 ... !!!CountOtherDBBackends| found no backends, try 84 So the backend is not completely stuck, but CommitTransactionCommand() may take more than 5 seconds under some circumstances (maybe it's worth investigating which exactly). Best regards, Alexander