Date: Thu, 9 Oct 2025 12:15:31 -0400
From: Andres Freund
To: Nathan Bossart
Cc: pgsql-hackers@postgresql.org
Subject: Re: another autovacuum scheduling thread

Hi,

On 2025-10-09 11:01:16 -0500, Nathan Bossart wrote:
> On Wed, Oct 08, 2025 at 01:37:22PM -0400, Andres Freund wrote:
> > On 2025-10-08 10:18:17 -0500, Nathan Bossart wrote:
> >> The attached patch works by storing the maximum of the XID age and the MXID
> >> age in the list with the OIDs and sorting it prior to processing.
> >
> > I think it may be worth trying to avoid reliably using the same order -
> > otherwise e.g. a corrupt index on the first scheduled table can cause
> > autovacuum to reliably fail on the same relation, never allowing it to
> > progress past that point.
>
> Hm. What if we kept a short array of "failed" tables in shared memory?

I've thought about having that as part of pgstats...

> Each worker would consult this table before processing. If the table is
> there, it would remove it from the shared table and skip processing it.
> Then the next worker would try processing the table again.
>
> I also wonder how hard it would be to gracefully catch the error and let
> the worker continue with the rest of its list...

The main set of cases I've seen are when workers get hung up permanently in
corrupt indexes. There never is actually an error, the autovacuums just get
terminated as part of whatever independent reason there is to restart.

The problem with that is that you'll never actually have vacuum fail...

Greetings,

Andres Freund
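
To make the quoted "failed tables" idea concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not PostgreSQL code and not the attached patch: the names Table, FailedTables, run_worker and make_failing_vacuum are hypothetical, a plain list stands in for the shared-memory array, and a raised exception stands in for a vacuum that errors out. Tables are ordered by the maximum of XID age and MXID age, as the patch description says; a worker that finds a table in the failed list removes the entry and skips the table once, so a later worker retries it.

```python
from dataclasses import dataclass


@dataclass
class Table:
    oid: int
    xid_age: int
    mxid_age: int

    @property
    def priority(self) -> int:
        # Sort key described in the thread: max of XID age and MXID age.
        return max(self.xid_age, self.mxid_age)


class FailedTables:
    """Hypothetical stand-in for a short shared-memory array of table OIDs."""

    def __init__(self, capacity: int = 8):
        self.capacity = capacity
        self.oids = []

    def add(self, oid: int) -> None:
        if oid in self.oids:
            return
        if len(self.oids) >= self.capacity:
            self.oids.pop(0)  # assumption: evict the oldest entry when full
        self.oids.append(oid)

    def take(self, oid: int) -> bool:
        """If oid is listed, remove it and return True (caller skips it once)."""
        if oid in self.oids:
            self.oids.remove(oid)
            return True
        return False


def make_failing_vacuum(bad_oids):
    """Build a fake vacuum function that raises for the given OIDs."""
    def vacuum(table: Table) -> None:
        if table.oid in bad_oids:
            raise RuntimeError(f"vacuum failed on table {table.oid}")
    return vacuum


def run_worker(tables, failed: FailedTables, vacuum):
    """Process tables oldest-first; return the OIDs vacuumed successfully."""
    processed = []
    for table in sorted(tables, key=lambda t: t.priority, reverse=True):
        if failed.take(table.oid):
            continue  # skip this round; the next worker will try it again
        try:
            vacuum(table)
        except RuntimeError:
            failed.add(table.oid)
            return processed  # this worker stops at the first error
        processed.append(table.oid)
    return processed
```

With three tables where the highest-priority one always fails, successive calls to run_worker alternate: one worker stops at the bad table and records it, the next skips it and vacuums the rest. Note that the entry is only recorded inside the except branch, which is exactly the limitation raised above: a worker that hangs in a corrupt index and is later terminated never reaches that branch, so this model would not catch that case.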