From: Tomas Vondra
To: Alexander Lakhin, Tom Lane, Tomas Vondra
Cc: Michael Paquier, "Iwata, Aya/岩田 彩", Peter Smith,
    "Kuroda, Hayato/黒田 隼人", Pavel Stehule, Chao Li, pgsql-hackers
Subject: Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE
Date: Thu, 2 Apr 2026 13:26:32 +0200
Message-ID: <1b796dc7-a873-4249-8c35-7ea7b9772904@vondra.me>
References: <1020519.1773863522@sss.pgh.pa.us> <3074140.1775074810@sss.pgh.pa.us>

On 4/2/26 06:00, Alexander Lakhin wrote:
> Hello Tom and Tomas,
>
> 01.04.2026 23:20, Tom Lane wrote:
>> Alexander Lakhin writes:
>>> I think this can explain slow CommitTransactionCommand() and why it
>>> happens not every time. Regarding other animals, I guess they can
>>> experience the same bumps but not exceeding 5 seconds (50 tries). Thus,
>>> from my understanding, for the failure to happen, we need to have slow
>>> storage and initialize_worker_spi() -> CommitTransactionCommand() reaching
>>> XLogFileClose().
>>
>> So, it remains not very clear why only widowbird is showing this
>> failure, but I think we can safely take away the bottom-line
>> conclusion that hard-wiring a maximum wait of 5s in
>> CountOtherDBBackends() was not a great idea.
>
> There also were two failures from jay: [1], [2], but yes, widowbird is
> getting more and more consistent in that aspect: [3], probably because
> of the storage (SD card?) degradation.
>

Jay is a regular machine, a 2-core VM hosted at a university, so not
very powerful, but it has been running just fine for years and seems to
be healthy.

> Tomas, maybe you could check if the write speed is more or less acceptable
> there?
>

Will do. I'll wait for the tests to complete on widowbird, and then do
some testing on the storage (it's running from a flash drive, not an SD
card, and I saw stuff in dmesg when it was dying in the past, but now
it's clean).


regards

-- 
Tomas Vondra