public inbox for [email protected]
help / color / mirror / Atom feedFrom: Daniil Davydov <[email protected]>
To: Masahiko Sawada <[email protected]>
Cc: Maxim Orlov <[email protected]>
Cc: Postgres hackers <[email protected]>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: Sat, 3 May 2025 01:12:49 +0700
Message-ID: <CAJDiXgi6ZQOoSEqj9RyZMEh+HHBtmW0+PHD85UNPtKch8ubvdg@mail.gmail.com> (raw)
In-Reply-To: <CAD21AoCdx5ZNS_cO7bYz1Zfb+Kw1kuJV2wtewrz7T1pPpjcWGw@mail.gmail.com>
References: <CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com>
<CAD21AoCdx5ZNS_cO7bYz1Zfb+Kw1kuJV2wtewrz7T1pPpjcWGw@mail.gmail.com>
On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <[email protected]> wrote:
>
> As I understand it, we initially disabled parallel vacuum for
> autovacuum because their objectives are somewhat contradictory.
> Parallel vacuum aims to accelerate the process by utilizing additional
> resources, while autovacuum is designed to perform cleaning operations
> with minimal impact on foreground transaction processing (e.g.,
> through vacuum delay).
>
Yep, we also decided that we must not create more a/v workers for
index processing.
In current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).
> Nevertheless, I see your point about the potential benefits of using
> parallel vacuum within autovacuum in specific scenarios. The crucial
> consideration is determining appropriate criteria for triggering
> parallel vacuum in autovacuum. Given that we currently support only
> parallel index processing, suitable candidates might be autovacuum
> operations on large tables that have a substantial number of
> sufficiently large indexes and a high volume of garbage tuples.
>
> Although the actual number of parallel workers ultimately depends on
> the number of eligible indexes, it might be beneficial to introduce a
> storage parameter, say parallel_vacuum_workers, that allows control
> over the number of parallel vacuum workers on a per-table basis.
>
For now, we have three GUC variables for this purpose:
max_parallel_index_autovac_workers, autovac_idx_parallel_min_rows,
autovac_idx_parallel_min_indexes.
That is, everything is as you said. But we are still conducting
research on this issue. I would like to get rid of some of these
parameters.
> Regarding implementation: I notice the WIP patch implements its own
> parallel vacuum mechanism for autovacuum. Have you considered simply
> setting at_params.nworkers to a value greater than zero?
>
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse
code of VACUUM PARALLEL, because it creates its own processes via
dynamic bgworkers machinery.
As I said above - we don't want to consume additional resources. Also
we don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).
As a result, we created our own implementation of parallel index
processing control - see changes in vacuumparallel.c and autovacuum.c.
--
Best regards,
Daniil Davydov
view thread (112+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: POC: Parallel processing of indexes in autovacuum
In-Reply-To: <CAJDiXgi6ZQOoSEqj9RyZMEh+HHBtmW0+PHD85UNPtKch8ubvdg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox