From: Sami Imseih
Date: Tue, 6 May 2025 15:11:38 -0500
Subject: Re: POC: Parallel processing of indexes in autovacuum
To: Masahiko Sawada
Cc: Daniil Davydov <3danissimo@gmail.com>, Maxim Orlov, Postgres hackers

> On Mon, May 5, 2025 at 5:21 PM Sami Imseih wrote:
> >
> >
> >> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >> >
> >> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada wrote:
> >> > >
> >> > > > In current implementation, the leader process sends a signal to the
> >> > > > a/v launcher, and the launcher tries to launch all requested workers.
> >> > > > But the number of workers never exceeds `autovacuum_max_workers`.
> >> > > > Thus, we will never have more a/v workers than in the standard case
> >> > > > (without this feature).
> >> > >
> >> > > I have concerns about this design. When autovacuuming on a single
> >> > > table consumes all available autovacuum_max_workers slots with
> >> > > parallel vacuum workers, the system becomes incapable of processing
> >> > > other tables. This means that when determining the appropriate
> >> > > autovacuum_max_workers value, users must consider not only the number
> >> > > of tables to be processed concurrently but also the potential number
> >> > > of parallel workers that might be launched. I think it would more make
> >> > > sense to maintain the existing autovacuum_max_workers parameter while
> >> > > introducing a new parameter that would either control the maximum
> >> > > number of parallel vacuum workers per autovacuum worker or set a
> >> > > system-wide cap on the total number of parallel vacuum workers.
> >> > >
> >> >
> >> > For now we have max_parallel_index_autovac_workers - this GUC limits
> >> > the number of parallel a/v workers that can process a single table. I
> >> > agree that the scenario you provided is problematic.
> >> > The proposal to limit the total number of supportive a/v workers seems
> >> > attractive to me (I'll implement it as an experiment).
> >> >
> >> > It seems to me that this question is becoming a key one. First we need
> >> > to determine the role of the user in the whole scheduling mechanism.
> >> > Should we allow users to determine priority? Will this priority affect
> >> > only within a single vacuuming cycle, or it will be more 'global'?
> >> > I guess I don't have enough expertise to determine this alone. I will
> >> > be glad to receive any suggestions.
> >>
> >> What I roughly imagined is that we don't need to change the entire
> >> autovacuum scheduling, but would like autovacuum workers to decides
> >> whether or not to use parallel vacuum during its vacuum operation
> >> based on GUC parameters (having a global effect) or storage parameters
> >> (having an effect on the particular table). The criteria of triggering
> >> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> >> that we don't unnecessarily use parallel vacuum on many tables.
> >
> >
> > Perhaps we should only provide a reloption, therefore only tables specified
> > by the user via the reloption can be autovacuumed in parallel?
> >
> > This gives a targeted approach. Of course if multiple of these allowed tables
> > are to be autovacuumed at the same time, some may not get all the workers,
> > But that’s not different from if you are to manually vacuum in parallel the tables
> > at the same time.
> >
> > What do you think ?
> >
> +1. I think that's a good starting point. We can later introduce a new
> GUC parameter that globally controls the maximum number of parallel
> vacuum workers used in autovacuum, if necessary.

And I think this reloption should also apply to parallel heap vacuum in
non-failsafe scenarios. In the failsafe case, however, all tables will be
eligible for parallel vacuum. Anyhow, that discussion can be taken up in
that thread, but I wanted to point it out.

--
Sami Imseih
Amazon Web Services (AWS)