MIME-Version: 1.0
References: 
 <CAApHDvq_j+GVqX_ZAmvn236Mgg5OYQ6_s9kVsyoo1tJa2RJ=2w@mail.gmail.com>
 <CAA5RZ0udjEYJupob5tv3286e28bMpajgsy+4nAbxg73YyigZFw@mail.gmail.com>
 <CAApHDvq9ecYSetXUhjMc7E_Jcc7gxxRkA0LosX9PP1-jM_Ak8A@mail.gmail.com>
 <aRNmEPQ18qZlLowV@nathan>
 <CAApHDvrtvMF3_W69hOUr2SnbizjC1jc68_Ca0nPYr=+VUkUkAw@mail.gmail.com>
 <aROY-MUVO_mYTl2f@nathan>
 <CAApHDvpo3YxiaP123vghHL-aLkY2vc-e1scpDSpaTFWz1qVaQA@mail.gmail.com>
 <CABV9wwOcD4RM5Hm0d+=KBK8LyHLtEtF3YZ4ak8V3VvOYV9Z4Tw@mail.gmail.com>
 <aRTpqMleDpoQm9OO@nathan>
 <CA+TgmoY27S+nbgdCrVrc8S4p38NwTAC8_Uyq5ZaX6zxYToebXA@mail.gmail.com>
 <aR9BATffJN-hmQ1w@nathan>
 <CAA5RZ0vJH5k+tWbCG-tLfSCb7=jngwLkQHdwPLo8gP92mg2i_g@mail.gmail.com>
 <CA+TgmobEGG4pUzLMY-02N_FEY8uW4520X3rTB+BdVNUkLzDLMQ@mail.gmail.com>
 <CAApHDvqK=dUa35oZjG8kh+A-aPof2pfsNmb-WXTMVHduKpm6bQ@mail.gmail.com>
 <CA+TgmoYCLqE-1vX9uhryF7NhOAT0v6+RP8E303u6rRgpFUiWyg@mail.gmail.com>
 <CAApHDvqFyEdWyEDT7NCKcqEi2sdchhOsA-+yWv_Zk=dRek4kgg@mail.gmail.com>
 <CA+TgmoZE=-Pw=85u+eX6UpMdY02o1otEhQba0n2T78yVyV2N9g@mail.gmail.com>
In-Reply-To: 
 <CA+TgmoZE=-Pw=85u+eX6UpMdY02o1otEhQba0n2T78yVyV2N9g@mail.gmail.com>
From: Sami Imseih <samimseih@gmail.com>
Date: Sat, 22 Nov 2025 11:28:10 -0600
Message-ID: 
 <CAA5RZ0vzu4vfRP0podGDCBOn-OaK9Qs-oyksER5OtzanacqMQw@mail.gmail.com>
Subject: Re: another autovacuum scheduling thread
To: Robert Haas <robertmhaas@gmail.com>
Cc: David Rowley <dgrowleyml@gmail.com>,
 Nathan Bossart <nathandbossart@gmail.com>,
	Robert Treat <rob@xzilla.net>, Jeremy Schneider <schneider@ardentperf.com>,
 pgsql-hackers@postgresql.org
Content-Type: text/plain; charset="UTF-8"
Archived-At: 
 <https://www.postgresql.org/message-id/CAA5RZ0vzu4vfRP0podGDCBOn-OaK9Qs-oyksER5OtzanacqMQw%40mail.gmail.com>
Precedence: bulk

> > I suspect the most likely area the new prioritisation order could
> > cause issues is from the lack of randomness. Will multiple workers
> > working into the same database be more likely to bump into each other
> > somehow in a bad way? Maybe that's a good area to focus testing.
>
> I agree that lack of randomness could cause problems, but I don't see
> how it could cause regressions, because the current system isn't
> random, either. Even if the order of pg_class is unpredictable, it may
> (depending on the workload) not change very much from one day to the
> next.
>
> > Yeah partly, but mostly I just really doubt that this matters that
> > much. It's been said on this thread already that prioritisation isn't
> > as important as the autovacuum-configured-to-run-too-slowly issue, and
> > I agree with that. I just find it hard to believe that the highly
> > volatile pg_class order has been just perfect all these years and that
> > sorting by percentage-over-threshold-desc will make things worse
> > overall. There was mention that pg_catalog tables are first in
> > pg_class, but I don't really agree with that as if I create some new
> > tables on a fresh database, I see those getting lower ctids than any
> > pg_catalog table. The space for that is finite, but there's no
> > shortage of other reasons for user tables to become mentioned in
> > pg_class before catalogue tables as the database gets used. I see that
> > table_beginscan_catalog() uses SO_ALLOW_SYNC too, so there's an extra
> > layer of randomness from sync scans. I don't recall any complaints
> > from the order autovacuum works on tables, so, to me, it just seems
> > strange to think that the volatile order of pg_class just happened to
> > be right all these years. I suspect what's happening is that the extra
> > bloat or stale statistics that people get as a result of the
> > pg_class-order autovacuum just gets unnoticed, ignored or attended to
> > via adjustments to the corresponding scale_factor reloption.
>
> Interesting. I don't have any real knowledge of how jumbled-up the
> order of pg_class is on real production systems, and I agree that if
> the answer is "it's usually quite jumbled up" then that is good news
> for this patch. In any case, I'm not trying to say that prioritization
> is an intrinsically bad idea, because I don't believe that. What I'm
> trying to say is that there's a limited number of ways for this patch
> to make things worse

What I have not been able to prove from my tests is that the processing
order of tables by autovacuum will actually make things any better or any
worse. My tests have been short, 30-minute runs that count how many
vacuum cycles tables of various sizes and levels of DML activity received.
I have not found much difference. I am also not sure how valuable
these short-duration tests are.
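
A minimal sketch of that kind of measurement, assuming two snapshots of
autovacuum_count per table (as exposed by pg_stat_user_tables) taken at the
start and end of a run. The function name and the sample numbers are
hypothetical and are not taken from the actual tests:

```python
def vacuum_cycles(before, after):
    """Return per-table autovacuum cycles between two snapshots.

    Each snapshot maps table name -> autovacuum_count, as it might be
    read from pg_stat_user_tables. Tables absent from `before` (for
    example, created during the run) are treated as starting at 0.
    """
    return {
        table: count - before.get(table, 0)
        for table, count in after.items()
    }


# Hypothetical snapshot values, for illustration only.
before = {"small_hot": 10, "large_cold": 2}
after = {"small_hot": 25, "large_cold": 3, "new_table": 1}

cycles = vacuum_cycles(before, after)

# List the tables that received the most vacuum cycles first.
for table, n in sorted(cycles.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{table}: {n}")
```

Comparing these per-table deltas between a run with the existing ordering and
a run with the new ordering is one way to see whether the processing order
changes how often each table gets vacuumed.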

The real test occurs in the field, and it may turn out that the new
strategy improves the majority of cases while the existing strategy is
somehow better in others. Having the ability to go back to the existing
behavior seems like the best way we can roll this out and learn over time.

These may be the only two strategies we will ever need, or we may find out
that a third strategy, in which individual tables are assigned a
prioritization score, would also be useful.

--
Sami