public inbox for [email protected]  
help / color / mirror / Atom feed
From: Masahiko Sawada <[email protected]>
To: Daniil Davydov <[email protected]>
Cc: Sami Imseih <[email protected]>
Cc: Alexander Korotkov <[email protected]>
Cc: Matheus Alcantara <[email protected]>
Cc: Maxim Orlov <[email protected]>
Cc: Postgres hackers <[email protected]>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: Mon, 5 Jan 2026 10:51:11 -0800
Message-ID: <CAD21AoB7v5tLPXLK=qmtt6PaEC1f+Fb-gh+MwAbXfm6x4eZGNw@mail.gmail.com> (raw)
In-Reply-To: <CAJDiXgjt5ZmK2uvS0E8Ztt5ePYmq8Ze_dG05Zo2NUsKLHCEuYA@mail.gmail.com>
References: <CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com>
	<CAD21AoCdx5ZNS_cO7bYz1Zfb+Kw1kuJV2wtewrz7T1pPpjcWGw@mail.gmail.com>
	<CAJDiXgi6ZQOoSEqj9RyZMEh+HHBtmW0+PHD85UNPtKch8ubvdg@mail.gmail.com>
	<CAD21AoBcoA-i-pJ_=y+jg14R8_QaJA1iwktCnu5i-C=yXDFPdA@mail.gmail.com>
	<CAJDiXgjnUdE6Sk4M0unmT+9dULyFAxcum2txQKpWTuo4uQ_oXQ@mail.gmail.com>
	<CAD21AoBTZdVR93JBo620B=MX-K8cdm3VRbjrBr_Vcpngk3AjVw@mail.gmail.com>
	<CAA5RZ0vfBg=c_0Sa1Tpxv8tueeBk8C5qTf9TrxKBbXUqPc99Ag@mail.gmail.com>
	<CAD21AoBgvUeWS8ZsXBahA1XdYayK6DJ6dx49d6Xpii-iH+Hrwg@mail.gmail.com>
	<CAA5RZ0vF+Lr-jU1LAZWTGUjboUETk8oLvaNBbA5ozX6dau+how@mail.gmail.com>
	<CAJDiXggueLSGMNRmLshbmFRfbo4jzks0W8bLDfUSRZ-61fPVEQ@mail.gmail.com>
	<CAFY6G8cJ=DRTX75pOGerH6sk39dRt+7MSH+y_qppDdhPs=qdQA@mail.gmail.com>
	<CAJDiXgg1t6wk9NjyMUTm1iKqM9GtdQ_wrEchBtz3xjWBZM8W8A@mail.gmail.com>
	<CAD21AoAC0=Xi38RQcAO4A+vdmoXToZMoHfbS=KLT49fAOTH_gA@mail.gmail.com>
	<CAJDiXgiD+AZKhJSn-FSRVQxtDLmJd95wDu4wtKniQF5==1JcjQ@mail.gmail.com>
	<CAD21AoAM8KsqNhrZYJuf7odvxcTC0TumXazJc-r_wC5KnDFDPg@mail.gmail.com>
	<CAJDiXghbcOC9OOj3ampxuyqXH0geggnosnrYUHGygkpss-RtxA@mail.gmail.com>
	<CAD21AoAPnq0vrcGgeN++r1GoL8Kza7jaGL=TNzuBn6+MkR=rUQ@mail.gmail.com>
	<CAJDiXghmsbTmnm--9B5bbuZXa1OL7SZ0HYppX3tx9XsdwfJBhA@mail.gmail.com>
	<[email protected]>
	<CAJDiXgiYiX+azuR76DcVx8fZn57m_4v6cB14-GW34mWa=qudFQ@mail.gmail.com>
	<CAD21AoDtPpkkQ_h1yf4oTx1qn4SRdTeVY3qs+9J07fYqa_4Gww@mail.gmail.com>
	<CAJDiXgi7KB7wSQ=Ux=ngdaCvJnJ5x-ehvTyiuZez+5uKHtV6iQ@mail.gmail.com>
	<CAD21AoCcHKKXsr9Oh736ejckqqS1i430xGEyJ=JP5OL0ExyP1A@mail.gmail.com>
	<CAJDiXghaFT_1sSv3q8mjyZ_RLZDgiogg0mWRvLxSWvkUi2CcLg@mail.gmail.com>
	<CAA5RZ0u63W41OmcEO+HLs4CSo-Sd3J+Q-4=04iud8V=xX4iUrA@mail.gmail.com>
	<CAJDiXgin1TXniVGJKzOTA=F9K342uVfm6O0EmubTVB=F+XSrbA@mail.gmail.com>
	<CAD21AoDadzAwibxf-+urjx=XL+eVu8=Ut-Lh2GxXUt32LbPG3Q@mail.gmail.com>
	<CAD21AoD6HhraqhOgkQJOrr0ixZkAZuqJRpzGv-B+_-ad6d5aPw@mail.gmail.com>
	<CAJDiXgiGSpqMQSOx-cVO_LtcB5GWHBy9ph7oOR4ebbX8A==kgw@mail.gmail.com>
	<CAD21AoBRRXbNJEvCjS-0XZgCEeRBzQPKmrSDjJ3wZ8TN28vaCQ@mail.gmail.com>
	<CAPpHfduBJfMcojvmYHUo8b_C=0cxRy1N+tNiNGoA3RAZq2ApaA@mail.gmail.com>
	<CAD21AoC82NeHKXc965pPUZO2eyo1U7P6cmfRJbrcPDcnd7_6hw@mail.gmail.com>
	<CAJDiXghP2kXnEz+cj3rAWNM3NdKSB_4WtnngFXpVz2omPhGr5A@mail.gmail.com>
	<CAD21AoA0bnRZC_OqKMnH-Ln+OZ9z9k56j2c_MXj8pw69O-wkBw@mail.gmail.com>
	<CAA5RZ0sSXDza7_nUUbhHL_Sws+M+HR1daKJPXHpdLuNCkwUgUg@mail.gmail.com>
	<CAJDiXggrBsbzOisf+Nu8pZkYGrpUZaFbosL1Wbm3kKxzTm4xgw@mail.gmail.com>
	<CAA5RZ0tbiPcgQEjnhdnjz6qSjfRsGrr8jGCaMcrMaoPpax3wig@mail.gmail.com>
	<CAJDiXgjt5ZmK2uvS0E8Ztt5ePYmq8Ze_dG05Zo2NUsKLHCEuYA@mail.gmail.com>

On Sun, Nov 23, 2025 at 7:02 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Sun, Nov 23, 2025 at 5:51 AM Sami Imseih <[email protected]> wrote:
> >
> > > > nworkers has a double meaning. The return value of
> > > > AutoVacuumReserveParallelWorkers
> > > > is nreserved. I think this should be
> > > >
> > > > ```
> > > > nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> > > > ```
> > > >
> > > > and nreserved becomes the authoritative value for the number of parallel
> > > > workers after that point.
> >
> > I could not find this pattern being used in the code base.
> > I think it will be better and more in-line without what we generally do
> > and pass-by-reference and update the value inside
> > AutoVacuumReserveParallelWorkers:
> >
> > ```
> > AutoVacuumReserveParallelWorkers(&nworkers).
> > ```
>
> Maybe I just don't like functions with side effects, but this function will
> have ones anyway. I'll add logic with passing by reference as you
> suggested.
>
> >
> > >> ---
> > >> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
> > >> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> > >>   short_desc => 'Maximum number of parallel autovacuum workers, that
> > >> can be taken from bgworkers pool.',
> > >>   long_desc => 'This parameter is capped by "max_worker_processes"
> > >> (not by "autovacuum_max_workers"!).',
> > >>   variable => 'autovacuum_max_parallel_workers',
> > >>   boot_val => '0',
> > >>   min => '0',
> > >>   max => 'MAX_BACKENDS',
> > >> },
> > >>
> > >> Parallel vacuum in autovacuum can be used only when users set the
> > >> autovacuum_parallel_workers storage parameter. How about using the
> > >> default value 2 for autovacuum_max_parallel_workers GUC parameter?
> >
> > > Sounds reasonable, +1 for it.
> >
> > v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
> > It should now be 2.
> >
> > +          Sets the maximum number of parallel autovacuum workers that
> > +          can be used for parallel index vacuuming at one time. Is capped by
> > +          <xref linkend="guc-max-worker-processes"/>. The default is 0,
> > +          which means no parallel index vacuuming.
>
> Thanks for noticing it! Fixed.
>
> I am sending an updated set of patches.

Thank you for updating the patch! I've reviewed the 0001 patch and
here are some comments:

---
+        /* Remember how many workers we have reserved. */
+        av_nworkers_reserved += *nworkers;

I think we can simply assign *nworkers to av_nworkers_reserved instead
of incrementing it as we're sure that av_nworkers_reserved is 0 at the
beginning of this function.

---
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+        /* Only leader worker can call this function. */
+        Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+        if (av_nworkers_reserved > 0)
+                AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}

We can put an assertion at the end of the function to verify that this
worker doesn't reserve any worker.

---
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+        if (code != 0)
+                AutoVacuumReleaseAllParallelWorkers();
+}

I think it would be more future-proof if we call
AutoVacuumReleaseAllParallelWorkers() regardless of the code if there
is no strong reason why we check the code there.

---
+        before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);

     /*
      * Create a per-backend PGPROC struct in shared memory.  We must do this
      * before we can use LWLocks or access any shared memory.
      */
     InitProcess();

I think it's better to register the
autovacuum_worker_before_shmem_exit() after the process
initialization. The function could use LWLocks to release the reserved
workers. Given that AutoVacuumReleaseAllParallelWorkers() doesn't try
to release the reserved worker when av_nworkers_reserved == 0, but it
would be more future-proof to do that after the basic process
initialization processes.

How about renaming autovacuum_worker_before_shmem_exit() to
autovacuum_worker_onexit()?

---
IIUC the patch needs to implement some logic to propagate the updates
of vacuum delay parameters to parallel vacuum workers. Are you still
working on it? Or shall I draft this part on top of the 0001 patch?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com






view thread (112+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: POC: Parallel processing of indexes in autovacuum
  In-Reply-To: <CAD21AoB7v5tLPXLK=qmtt6PaEC1f+Fb-gh+MwAbXfm6x4eZGNw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox