From: Masahiko Sawada <[email protected]>
To: Sami Imseih <[email protected]>
Cc: Daniil Davydov <[email protected]>
Cc: Maxim Orlov <[email protected]>
Cc: Postgres hackers <[email protected]>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: Mon, 5 May 2025 22:15:43 -0700
Message-ID: <CAD21AoBgvUeWS8ZsXBahA1XdYayK6DJ6dx49d6Xpii-iH+Hrwg@mail.gmail.com> (raw)
In-Reply-To: <CAA5RZ0vfBg=c_0Sa1Tpxv8tueeBk8C5qTf9TrxKBbXUqPc99Ag@mail.gmail.com>
References: <CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com>
<CAD21AoCdx5ZNS_cO7bYz1Zfb+Kw1kuJV2wtewrz7T1pPpjcWGw@mail.gmail.com>
<CAJDiXgi6ZQOoSEqj9RyZMEh+HHBtmW0+PHD85UNPtKch8ubvdg@mail.gmail.com>
<CAD21AoBcoA-i-pJ_=y+jg14R8_QaJA1iwktCnu5i-C=yXDFPdA@mail.gmail.com>
<CAJDiXgjnUdE6Sk4M0unmT+9dULyFAxcum2txQKpWTuo4uQ_oXQ@mail.gmail.com>
<CAD21AoBTZdVR93JBo620B=MX-K8cdm3VRbjrBr_Vcpngk3AjVw@mail.gmail.com>
<CAA5RZ0vfBg=c_0Sa1Tpxv8tueeBk8C5qTf9TrxKBbXUqPc99Ag@mail.gmail.com>
On Mon, May 5, 2025 at 5:21 PM Sami Imseih <[email protected]> wrote:
>
>
>> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <[email protected]> wrote:
>> >
>> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <[email protected]> wrote:
>> > >
>> > > > In the current implementation, the leader process sends a signal to the
>> > > > a/v launcher, and the launcher tries to launch all requested workers.
>> > > > But the number of workers never exceeds `autovacuum_max_workers`.
>> > > > Thus, we will never have more a/v workers than in the standard case
>> > > > (without this feature).
>> > >
>> > > I have concerns about this design. When autovacuuming on a single
>> > > table consumes all available autovacuum_max_workers slots with
>> > > parallel vacuum workers, the system becomes incapable of processing
>> > > other tables. This means that when determining the appropriate
>> > > autovacuum_max_workers value, users must consider not only the number
>> > > of tables to be processed concurrently but also the potential number
>> > > of parallel workers that might be launched. I think it would make
>> > > more sense to maintain the existing autovacuum_max_workers parameter while
>> > > introducing a new parameter that would either control the maximum
>> > > number of parallel vacuum workers per autovacuum worker or set a
>> > > system-wide cap on the total number of parallel vacuum workers.
>> > >
>> >
>> > For now we have max_parallel_index_autovac_workers - this GUC limits
>> > the number of parallel a/v workers that can process a single table. I
>> > agree that the scenario you provided is problematic.
>> > The proposal to limit the total number of supportive a/v workers seems
>> > attractive to me (I'll implement it as an experiment).
>> >
>> > It seems to me that this question is becoming a key one. First we need
>> > to determine the role of the user in the whole scheduling mechanism.
>> > Should we allow users to determine priority? Will this priority apply
>> > only within a single vacuuming cycle, or will it be more 'global'?
>> > I guess I don't have enough expertise to determine this alone. I will
>> > be glad to receive any suggestions.
>>
>> What I roughly imagined is that we don't need to change the entire
>> autovacuum scheduling, but would like autovacuum workers to decide
>> whether or not to use parallel vacuum during their vacuum operation
>> based on GUC parameters (having a global effect) or storage parameters
>> (having an effect on the particular table). The criteria for triggering
>> parallel vacuum in autovacuum might need to be somewhat pessimistic so
>> that we don't unnecessarily use parallel vacuum on many tables.
>
>
> Perhaps we should only provide a reloption, so that only tables specified
> by the user via the reloption can be autovacuumed in parallel?
>
> This gives a targeted approach. Of course if multiple of these allowed tables
> are to be autovacuumed at the same time, some may not get all the workers,
> but that is no different from manually vacuuming those tables in parallel
> at the same time.
>
> What do you think?

+1. I think that's a good starting point. We can later introduce a new
GUC parameter that globally controls the maximum number of parallel
vacuum workers used in autovacuum, if necessary.

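To make the two-step idea concrete, here is a rough, purely illustrative
sketch in Python (not PostgreSQL code and not taken from the patch). The
names ParallelBudget and plan_parallel_workers are made up for this example,
as is the assumption that the reloption holds a requested worker count. With
max_workers=None it models the reloption-only starting point; setting
max_workers models a possible later global GUC.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParallelBudget:
    """Hypothetical system-wide pool of parallel vacuum workers for autovacuum.

    max_workers=None models the reloption-only starting point (no global cap).
    """
    max_workers: Optional[int] = None
    in_use: int = 0

    def reserve(self, requested: int) -> int:
        """Reserve up to `requested` workers; return how many were granted."""
        if requested <= 0:
            return 0
        if self.max_workers is None:
            granted = requested
        else:
            # Never grant more than what is left under the global cap.
            granted = max(0, min(requested, self.max_workers - self.in_use))
        self.in_use += granted
        return granted

    def release(self, granted: int) -> None:
        """Return workers to the pool once the table's vacuum finishes."""
        self.in_use = max(0, self.in_use - granted)


def plan_parallel_workers(table_reloption: int, budget: ParallelBudget) -> int:
    """Workers for one table: 0 (serial vacuum) unless the user opted in
    through the hypothetical per-table reloption."""
    return budget.reserve(table_reloption)
```

With a cap of 4, a table asking for 3 workers gets 3, and a second table
asking for 3 at the same time gets only 1, which is the "some may not get
all the workers" case mentioned above.
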
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com