From: Sami Imseih
Date: Mon, 5 May 2025 19:21:07 -0500
Subject: Re: POC: Parallel processing of indexes in autovacuum
To: Masahiko Sawada
Cc: Daniil Davydov <3danissimo@gmail.com>, Maxim Orlov, Postgres hackers

> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >
> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > > In current implementation, the leader process sends a signal to the
> > > > a/v launcher, and the launcher tries to launch all requested workers.
> > > > But the number of workers never exceeds `autovacuum_max_workers`.
> > > > Thus, we will never have more a/v workers than in the standard case
> > > > (without this feature).
> > >
> > > I have concerns about this design. When autovacuuming on a single
> > > table consumes all available autovacuum_max_workers slots with
> > > parallel vacuum workers, the system becomes incapable of processing
> > > other tables. This means that when determining the appropriate
> > > autovacuum_max_workers value, users must consider not only the number
> > > of tables to be processed concurrently but also the potential number
> > > of parallel workers that might be launched. I think it would more make
> > > sense to maintain the existing autovacuum_max_workers parameter while
> > > introducing a new parameter that would either control the maximum
> > > number of parallel vacuum workers per autovacuum worker or set a
> > > system-wide cap on the total number of parallel vacuum workers.
> > >
> >
> > For now we have max_parallel_index_autovac_workers - this GUC limits
> > the number of parallel a/v workers that can process a single table. I
> > agree that the scenario you provided is problematic.
> > The proposal to limit the total number of supportive a/v workers seems
> > attractive to me (I'll implement it as an experiment).
> >
> > It seems to me that this question is becoming a key one. First we need
> > to determine the role of the user in the whole scheduling mechanism.
> > Should we allow users to determine priority? Will this priority affect
> > only within a single vacuuming cycle, or it will be more 'global'?
> > I guess I don't have enough expertise to determine this alone. I will
> > be glad to receive any suggestions.
>
> What I roughly imagined is that we don't need to change the entire
> autovacuum scheduling, but would like autovacuum workers to decides
> whether or not to use parallel vacuum during its vacuum operation
> based on GUC parameters (having a global effect) or storage parameters
> (having an effect on the particular table). The criteria of triggering
> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> that we don't unnecessarily use parallel vacuum on many tables.

Perhaps we should only provide a reloption, so that only tables the user
specifies via the reloption can be autovacuumed in parallel?

This gives a targeted approach. Of course, if several of these allowed tables
are autovacuumed at the same time, some may not get all the workers, but
that is no different from manually vacuuming those tables in parallel
at the same time.
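
To make the contention point concrete, here is a rough sketch in Python. It is purely illustrative and is not how the patch works: the reloption name `parallel_autovacuum_workers`, the `Table` class, and the `grant_workers` helper are all made up for this example, and the first-come, first-served policy is an assumption rather than anything agreed in this thread.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    # Hypothetical reloption: 0 means the table is not opted in.
    parallel_autovacuum_workers: int = 0


def grant_workers(tables, free_workers):
    """Grant parallel workers first come, first served from a shared pool.

    Returns {table name: workers granted}. Tables without the
    (hypothetical) reloption always get 0 and are vacuumed serially.
    """
    granted = {}
    for table in tables:
        wanted = table.parallel_autovacuum_workers
        # Opted-in tables get what they asked for, capped by what is left.
        got = min(wanted, free_workers) if wanted > 0 else 0
        granted[table.name] = got
        free_workers -= got
    return granted


tables = [
    Table("orders", parallel_autovacuum_workers=4),
    Table("events", parallel_autovacuum_workers=4),
    Table("users"),  # not opted in
]
result = grant_workers(tables, free_workers=6)
print(result)
```

With six free workers, `orders` gets its four, `events` gets only the remaining two, and `users` is vacuumed serially. This is the same kind of shortfall you would see when several manual `VACUUM (PARALLEL n)` commands compete for the same pool of background workers.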

What do you think?

--
Sami