Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uo71P-00GVBk-8Y for pgsql-hackers@arkaria.postgresql.org; Mon, 18 Aug 2025 21:04:04 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uo71N-002FrU-Ja for pgsql-hackers@arkaria.postgresql.org; Mon, 18 Aug 2025 21:04:02 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uo71N-002FrM-92 for pgsql-hackers@lists.postgresql.org; Mon, 18 Aug 2025 21:04:01 +0000 Received: from mail-lf1-x130.google.com ([2a00:1450:4864:20::130]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uo71L-000a1l-1T for pgsql-hackers@lists.postgresql.org; Mon, 18 Aug 2025 21:04:01 +0000 Received: by mail-lf1-x130.google.com with SMTP id 2adb3069b0e04-55ce508d4d6so3584064e87.0 for ; Mon, 18 Aug 2025 14:03:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1755551037; x=1756155837; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=B7jvHFuZdHLcO2BCT686NV15IabIwzs5U0jNCHx4mHo=; b=SCJywMxY8URZjHXzxXc3Ibdxuts1UkEI2/2UZYHiCuSRz2rXD7syYB9tNb3KrGgEeP DBaDtHcd2VIu1i1cNVXFb+Odt9ydMmdZFvEUElJ0k82AcjMZx1T7Rrqa/e9/ZRld55eD mKqkSydaT7E3GQh65cppwDNFUS5XD26k6U4t6DzIKrXMYLkS0g3Z4phuQVCY7P7PIwEW Cq1kQRVwBL3y/v2wKJo4agBUqgbaccZaJyLVy++7qdhwotOkeAf+EMws64GTxR0HGA33 SeD1NFKA0Z1+2icf9XEBpUCGRpioRPuyMvp16xe1KBggxYSDtlvIJoQw2HIYzXbnyghH KBqw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1755551037; x=1756155837; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=B7jvHFuZdHLcO2BCT686NV15IabIwzs5U0jNCHx4mHo=; b=GFisnNnMm3f6FEOJv4EQqCwIpnKrB+Lo0rlEHm2ugqnZL0Ll4vsou+USU/Fl2BzMdK JqYdZIvZSHpXqfAGttt8C7aNGG99N2mSS1KDXU+EERG1QLr5B8UeaI3tJ7/pnJ3m7Qcb pHpwr6oAhSALBgYBLi5QLAg+xqInbO0GIaudSBGz0oW3O3X81BsGHtTgYSisGjCoA3kM yi/exMv58oAkIffLBxT5Go0ehX7/+GQ92OJdA56x2TKQ0c0pF4SXd+AFr0SVVQf4Nvfo nEcfD1VOMsyigPTjoKOfZja3G4ixBfywgq48NCnSvGsfyb1SKG9rfMrFDQJ6txutuowE UAcA== X-Forwarded-Encrypted: i=1; AJvYcCUSyasSsiVc+8WQvS0hvrGELWr7XRHO9ah78luc6UscvW6Abmo+T+sYg8hDnWS5T1wUypYVWa6mVQzXVNEp@lists.postgresql.org X-Gm-Message-State: AOJu0YzcBw3vUN41BmpqT9rvBcygTJkjwvoeQ6j2x88zBrtOPWL2b6E+ Lu/xojnz152F4DLHNjTF80Eqkh5KO33CrC9CgXOUKElol0TdSgKxGPpBwMVHf8YpB37E9yynSiP C6GqbI5NZB0c2D7LeOZ33pKHNNBQxUy80cZOU X-Gm-Gg: ASbGncu6Z7dRoNTR75PW5SHAat2zn9pRUHJm0bpvPOqaS+aOCgI2N1JnScbGXn7fS85 tHzMz9oZqjcORJeyxm83vn3y2ZD3hir9d2x3BTaQmoRcuWRVb5l26b6jwCGKwFZgIA5jCGwGGsy osLWifu+dtYu6t3cgKw9IJJlDkmopYumwAe0amYfLSEefEftlev4DFgyBOdcQArjMbZ8haBrgc6 Fqp+w== X-Google-Smtp-Source: AGHT+IHMeu83WmuHsF2MCmpm/44+hh7fQgOnagmhCN7svgiFX8bJTSSHU5ouMKLROo46R/3YGda0iA8c2ln+wvw2l7A= X-Received: by 2002:a05:6512:3345:b0:55c:e6c9:3499 with SMTP id 2adb3069b0e04-55e0082c041mr73323e87.31.1755551036828; Mon, 18 Aug 2025 14:03:56 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Masahiko Sawada Date: Mon, 18 Aug 2025 14:03:19 -0700 X-Gm-Features: Ac12FXyUgDexxKgKOdz1PuICxNg-cUcU_n_st0w6rjn2iYdntDb8kLB-0sdlz2s Message-ID: Subject: Re: POC: Parallel processing of indexes in autovacuum To: Daniil Davydov <3danissimo@gmail.com> Cc: Sami Imseih , Matheus Alcantara , Maxim Orlov , Postgres hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, Aug 18, 2025 at 1:31=E2=80=AFAM Daniil Davydov <3danissimo@gmail.co= m> wrote: > > > On Fri, Aug 15, 2025 at 3:41=E2=80=AFAM Masahiko Sawada wrote: > > > > > 2. when an autovacuum worker (not parallel vacuum worker) who uses > > parallel vacuum gets SIGHUP, it errors out with the error message > > "parameter "max_stack_depth" cannot be set during a parallel > > operation". Autovacuum checks the configuration file reload in > > vacuum_delay_point(), and while reloading the configuration file, it > > attempts to set max_stack_depth in > > InitializeGUCOptionsFromEnvironment() (which is called by > > ProcessConfigFileInternal()). However, it cannot change > > max_stack_depth since the worker is in parallel mode but > > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't > > happen in regular backends who are using parallel queries because they > > check the configuration file reload at the end of each SQL command. > > > > Hm, this is a really serious problem. I see only two ways to solve it (bo= th are > not really good) : > 1) > Do not allow processing of the config file during parallel autovacuum > execution. > > 2) > Teach the autovacuum to enter parallel mode only during the index vacuum/= cleanup > phase. I'm a bit wary about it, because the design says that we should > be in parallel > mode during the whole parallel operation. But actually, if we can make > sure that all > launched workers are exited, I don't see reasons, why can't we just > exit parallel mode > at the end of parallel_vacuum_process_all_indexes. > > What do you think about it? Hmm, given that we're trying to support parallel heap vacuum on another thread[1] and we will probably support it in autovacuums, it seems to me that these approaches won't work. Another idea would be to allow autovacuum workers to process the config file even in parallel mode. GUC changes in the leader worker would not affect parallel vacuum workers, but it is fine to me. In the context of autovacuum, only specific GUC parameters related to cost-based delays need to be affected also to parallel vacuum workers. Probably we need some changes to compute_parallel_delay() so that parallel workers can compute the sleep time based on the new vacuum_cost_limit and vacuum_cost_delay after the leader process (i.e., autovacuum worker) reloads the config file. > > Again, thank you for the review. Please, see v10 patches (only 0001 > has been changed) : > 1) Reserve and release workers only inside parallel_vacuum_process_all_in= dexes. > 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so we = can > release workers even after an error. This required adding a static > variable to account > for the total number of reserved workers (av_nworkers_reserved). > 3) Cap autovacuum_max_parallel_workers by max_worker_processes only insid= e > autovacuum code. Assign hook has been removed. > 4) Use shmem value for determining the maximum number of parallel autovac= uum > workers (eliminate race condition between launcher and leader process). Thank you for updating the patch! I'll review the new version patches. Regards, [1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H= 24AiJeNoPGCmZNSwL1YA%40mail.gmail.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com