From: Robert Haas
Date: Mon, 13 Oct 2025 08:32:07 -0400
Subject: Re: another autovacuum scheduling thread
To: Jeremy Schneider
Cc: Nathan Bossart, David Rowley, Sami Imseih,
 pgsql-hackers@postgresql.org
In-Reply-To: <20251010145959.414a2c27@ardentperf.com>

On Fri, Oct 10, 2025 at 6:00 PM Jeremy Schneider wrote:
> The spectacular failures I've seen with autovac usually come down to
> things like too much sleeping (cost_delay) or too few workers, where
> better ordering would be nice but probably wouldn't fix any real
> problems leading to the spectacular failures

Since I have said the same thing myself, I can hardly disagree.
However, there are probably a few exceptions. For instance, if
autovacuum on a certain table is failing repeatedly or accomplishing
nothing without removing the apparent need to autovacuum, and happens
to be the first one in pg_class, it could divert a lot of attention
from other tables.

> Robert it sounds to me like the main use case you're focused on here
> is where basically wraparound is imminent - we are already screwed - and
> our very last hope was that a last-ditch autovac can finish just in time

Yes, I would argue that this is the scenario that really matters. As
you say above, the main thing is having little enough sleeping and a
sufficient number of workers. When that's the case, we can do the work
in any order and life will mostly be fine. However, if we get into a
desperate situation by, say, having one table that can't be vacuumed,
and eventually someone fixes that, say by dropping the corrupt index
that is preventing vacuuming of that table, we might like it if
autovacuum focused on getting that table vacuumed rather than getting
lost in the sauce.
Of course, if we have the pretty common situation where autovacuum
gets behind on all tables, say due to a stale replication slot, then
this is less critical, although a perfect system would probably
prioritize vacuuming the *largest* tables in this situation, since
those will take the longest to finish, and it's when a vacuum of every
table in the cluster has been *completed* that the XID horizons can
advance.

> I hope y'all just pick something and commit it without getting too lost
> in the details. I honestly think in the list of improvements around
> autovac, this is the lowest priority on my list of hopes and dreams as a
> user for wraparound prevention :) because if this ever matters to me for
> avoiding wraparound, I was screwed long before we got to this point and
> this is not going to fix my underlying problems.

I'm not sure if this was your intention, but to me this kind of reads
like "well, it's not going to matter anyway so just do whatever and
move on" and I don't agree with that. I think that if we're not going
to do high-quality engineering here, we just shouldn't change anything
at all. It's better to keep having the same bad behavior than for each
release to have new and different bad behavior.

One possible positive result of leaning into this prioritization
problem is that whoever's working on it (Nathan, in this case) might
gain some useful insights about how to tackle some of the other
problems in this space. All of this is hard enough that we haven't
really had any major improvements in this area since, I want to say,
8.3, and it's desirable to break that logjam even if we don't all
agree on which problems are most urgent. Even if I ultimately don't
agree with whatever Nathan wants to do or proposes, I'm glad he's
trying to do something, which is (in my experience) generally much
better than making no effort at all.

-- 
Robert Haas
EDB: http://www.enterprisedb.com