public inbox for [email protected]  
From: Peter Geoghegan <[email protected]>
To: Andres Freund <[email protected]>
Cc: Nathan Bossart <[email protected]>
Cc: [email protected]
Subject: Re: another autovacuum scheduling thread
Date: Thu, 9 Oct 2025 15:45:32 -0400
Message-ID: <CAH2-Wz=vmdzeUE7PH9b0igFpJqDKY63icMWmzN=sLiyKVxyqOA@mail.gmail.com>
In-Reply-To: <o33hdbfnosn7pw5e3a34jdtfoaxih6vwbe6rf7bo6ocbn4zv4l@incbnjupgq4o>
References: <aOaAuXREwnPZVISO@nathan>
	<l7k5nkow2n4x2lodcjimxl4wqv7rdjduo3zuzjwlx3kjxty5q2@gzl4pqbm6ows>
	<aOfcTD3T7F3dydg8@nathan>
	<o33hdbfnosn7pw5e3a34jdtfoaxih6vwbe6rf7bo6ocbn4zv4l@incbnjupgq4o>

On Thu, Oct 9, 2025 at 12:15 PM Andres Freund <[email protected]> wrote:
> > Each worker would consult this table before processing.  If the table is
> > there, it would remove it from the shared table and skip processing it.
> > Then the next worker would try processing the table again.
> >
> > I also wonder how hard it would be to gracefully catch the error and let
> > the worker continue with the rest of its list...
>
> The main set of cases I've seen are when workers get hung up permanently in
> corrupt indexes.

How recently was this? I'm aware of problems like that, which we
discussed around 2018, but they were greatly mitigated: first by your
commit 3a01f68e, then by my commit c34787f9.

In general, there's no particularly good reason why (at least with
nbtree indexes) VACUUM should ever hang forever. The access pattern is
overwhelmingly simple, sequential access. The only exception is nbtree
page deletion (plus backtracking), where it isn't particularly hard to
just be very careful about self-deadlock.

> There never is actually an error, the autovacuums just get
> terminated as part of whatever independent reason there is to restart.

What do you mean?

In general I'd expect nbtree VACUUM of a corrupt index to either not
fail at all (we'll soldier on to the best of our ability when page
deletion encounters an inconsistency), or to get permanently stuck due
to locking the same page twice/self-deadlock (though as I said, those
problems were mitigated, and might even be almost impossible these
days). Every other case involves some kind of error (e.g., an OOM is
just about possible).

I agree with you that using a perfectly deterministic order comes
with real downsides, without any upside. Don't interpret what I've
said as expressing opposition to that idea.


--
Peter Geoghegan