From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 17 Jul 2025 12:42:27 -0700
Subject: Re: POC: Parallel processing of indexes in autovacuum
To: Daniil Davydov <3danissimo@gmail.com>
Cc: Matheus Alcantara <matheusssilv97@gmail.com>,
 Sami Imseih <samimseih@gmail.com>,
	Maxim Orlov <orlovmg@gmail.com>,
 Postgres hackers <pgsql-hackers@lists.postgresql.org>
Archived-At: 
 <https://www.postgresql.org/message-id/CAD21AoCcHKKXsr9Oh736ejckqqS1i430xGEyJ%3DJP5OL0ExyP1A%40mail.gmail.com>

On Mon, Jul 14, 2025 at 3:49 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>
>
> > ---
> > +   nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
> >     DestroyParallelContext(pvs->pcxt);
> > +
> > +   /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> > +   if (AmAutoVacuumWorkerProcess())
> > +       ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
> > +
> >
> > Why don't we release workers before destroying the parallel context?
> >
>
> Destroying parallel context includes waiting for all workers to exit (after
> which, other operations can use them).
> If we first call ParallelAutoVacuumReleaseWorkers, some operation can
> reasonably request all released workers. But this request can fail,
> because there is no guarantee that workers managed to finish.
>
> Actually, there's nothing wrong with that, but I think releasing workers
> only after finishing work is a more logical approach.
>


> > ---
> > @@ -706,16 +751,16 @@
> > parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int
> > num_index_scan
> >
> >         if (vacuum)
> >             ereport(pvs->shared->elevel,
> > -                   (errmsg(ngettext("launched %d parallel vacuum
> > worker for index vacuuming (planned: %d)",
> > -                                    "launched %d parallel vacuum
> > workers for index vacuuming (planned: %d)",
> > +                   (errmsg(ngettext("launched %d parallel %svacuum
> > worker for index vacuuming (planned: %d)",
> > +                                    "launched %d parallel %svacuum
> > workers for index vacuuming (planned: %d)",
> >                                      pvs->pcxt->nworkers_launched),
> > -                           pvs->pcxt->nworkers_launched, nworkers)));
> > +                           pvs->pcxt->nworkers_launched,
> > AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
> >
> > The "%svacuum" part doesn't work in terms of translation. We need to
> > construct the whole sentence instead.
> > But do we need this log message
> > change in the first place? IIUC autovacuums write logs only when the
> > execution time exceed the log_autovacuum_min_duration (or its
> > reloption). The patch unconditionally sets LOG level for autovacuums
> > but I'm not sure it's consistent with other autovacuum logging
> > behavior:
> >
> > +   int         elevel = AmAutoVacuumWorkerProcess() ||
> > +       vacrel->verbose ?
> > +       INFO : DEBUG2;
> >
> >
>
> This log level is used only "for messages about parallel workers launched".
> I think that such logs relate more to the parallel workers module than
> autovacuum itself. Moreover, if we emit log "planned vs. launched" each
> time, it will simplify the task of selecting the optimal value of
> 'autovacuum_max_parallel_workers' parameter. What do you think?

INFO-level messages are normally not sent to the server log. Also,
autovacuum doesn't write any log message when it starts. If we want to
report planned vs. launched workers, I think it's better to gather
these statistics during execution and write them together with the
other existing logs.
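
To illustrate what I mean by gathering the numbers during execution,
here is a minimal sketch. ParallelWorkerUsage, usage_accumulate() and
usage_format() are hypothetical names made up for this example, not
part of the patch or of PostgreSQL, and the message wording is only a
placeholder:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-table counters, accumulated over all index scans. */
typedef struct
{
    int nplanned;    /* total workers planned */
    int nlaunched;   /* total workers actually launched */
} ParallelWorkerUsage;

/* Called once per parallel index vacuum/cleanup pass. */
static void
usage_accumulate(ParallelWorkerUsage *u, int planned, int launched)
{
    assert(planned >= 0 && launched >= 0 && launched <= planned);
    u->nplanned += planned;
    u->nlaunched += launched;
}

/*
 * Formats one summary line that could be appended to the report that is
 * already written at the end of the autovacuum run.
 */
static int
usage_format(const ParallelWorkerUsage *u, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "parallel index workers: %d planned, %d launched",
                    u->nplanned, u->nlaunched);
}
```

With something like this, the per-pass ereport() would not need a
different log level for autovacuum at all.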

>
> About "%svacuum" - I guess we need to clarify what exactly the workers
> were launched for. I'll add errhint to this log, but I don't know whether such
> approach is acceptable.

I'm not sure errhint is an appropriate place. If we write such
information together with other existing autovacuum logs as I
suggested above, I think we don't need to add such information to this
log message.

I've reviewed v7 patch and here are some comments:

+   {
+       {
+           "parallel_autovacuum_workers",
+           "Maximum number of parallel autovacuum workers that can be
taken from bgworkers pool for processing this table. "
+           "If value is 0 then parallel degree will computed based on
number of indexes.",
+           RELOPT_KIND_HEAP,
+           ShareUpdateExclusiveLock
+       },
+       -1, -1, 1024
+   },

Many autovacuum related reloptions have the prefix "autovacuum". So
how about renaming it to autovacuum_parallel_worker (change
check_parallel_av_gucs() name too accordingly)?

---
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+                                     GucSource source)
+{
+   if (*newval >= max_worker_processes)
+       return false;
+   return true;
+}

I think we don't need to strictly check the
autovacuum_max_parallel_workers value. Instead, we can accept any
integer value but internally cap it at max_worker_processes.
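
A minimal sketch of that idea follows. The two variables are stand-ins
for the real GUCs, and effective_av_parallel_workers() is a
hypothetical helper name, not something in the patch:

```c
#include <assert.h>

/* Stand-ins for the GUC variables; values are for illustration only. */
static int max_worker_processes = 8;
static int autovacuum_max_parallel_workers = 0;

/*
 * Returns the limit to actually use: the configured value, capped at
 * max_worker_processes. No check hook needs to reject anything.
 */
static int
effective_av_parallel_workers(void)
{
    assert(max_worker_processes >= 0);
    if (autovacuum_max_parallel_workers > max_worker_processes)
        return max_worker_processes;
    return autovacuum_max_parallel_workers;
}
```

Callers would use the helper's result wherever the raw GUC value is
read today.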

---
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{

I think this function doesn't just check the value but does adjust the
number of available workers, so how about
adjust_free_parallel_workers() or something along these lines?

---
+       /*
+        * Number of available workers must not exceed limit.
+        *
+        * Note, that if some parallel autovacuum workers are running at this
+        * moment, available workers number will not exceed limit after
+        * releasing them (see ParallelAutoVacuumReleaseWorkers).
+       */
+       AutoVacuumShmem->av_freeParallelWorkers =
+           autovacuum_max_parallel_workers;

I think the comment refers to the following code in
AutoVacuumReleaseParallelWorkers():

+   /*
+    * If autovacuum_max_parallel_workers variable was reduced during parallel
+    * autovacuum execution, we must cap available workers number by its new
+    * value.
+    */
+   if (AutoVacuumShmem->av_freeParallelWorkers >
+       autovacuum_max_parallel_workers)
+   {
+       AutoVacuumShmem->av_freeParallelWorkers =
+           autovacuum_max_parallel_workers;
+   }

After the autovacuum launcher decreases av_freeParallelWorkers, it's
not guaranteed that the autovacuum worker has already picked up the
new value from the config file when it executes
AutoVacuumReleaseParallelWorkers(), which leads to skipping the above
code. For example, suppose that autovacuum_max_parallel_workers is 10
and 3 parallel workers are being used by one autovacuum worker (i.e.,
av_freeParallelWorkers = 7 now). If the user changes
autovacuum_max_parallel_workers to 5, the autovacuum launcher adjusts
av_freeParallelWorkers to 5. However, if the worker hasn't reloaded the
config file when it executes AutoVacuumReleaseParallelWorkers(), it
increases av_freeParallelWorkers to 8 and skips the adjusting logic.
I've not tested this scenario, so I might be missing something.
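
The arithmetic of that scenario can be modeled with a small standalone
sketch. This is only my reading of the patch, not its code: SharedModel,
launcher_adjust(), worker_release() and run_scenario() are hypothetical
names, and the model ignores locking entirely:

```c
#include <assert.h>

/* Hypothetical model of the shared state; free_workers plays the role
 * of av_freeParallelWorkers. */
typedef struct
{
    int free_workers;
} SharedModel;

/* Launcher side: after a config reload, cap the free count at the new limit. */
static void
launcher_adjust(SharedModel *shm, int new_limit)
{
    if (shm->free_workers > new_limit)
        shm->free_workers = new_limit;
}

/* Worker side: give back n workers, then cap by the limit this worker sees. */
static void
worker_release(SharedModel *shm, int n, int limit_seen_by_worker)
{
    assert(n >= 0);
    shm->free_workers += n;
    if (shm->free_workers > limit_seen_by_worker)
        shm->free_workers = limit_seen_by_worker;
}

/* Limit starts at 10 with 3 workers reserved, then is lowered to 5. */
static int
run_scenario(int limit_seen_by_worker)
{
    SharedModel shm = { 10 - 3 };

    launcher_adjust(&shm, 5);
    worker_release(&shm, 3, limit_seen_by_worker);
    return shm.free_workers;
}
```

In this model run_scenario(10), a worker still holding the old value,
ends with 8 free workers, while run_scenario(5) ends with 5.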

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com