Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w01b0-001XLC-1G for pgsql-hackers@arkaria.postgresql.org; Tue, 10 Mar 2026 18:14:18 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w01ax-0059rX-1l for pgsql-hackers@arkaria.postgresql.org; Tue, 10 Mar 2026 18:14:16 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w01aw-0059rO-2r for pgsql-hackers@lists.postgresql.org; Tue, 10 Mar 2026 18:14:15 +0000 Received: from mail-lf1-x134.google.com ([2a00:1450:4864:20::134]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w01au-00000001UKG-3itH for pgsql-hackers@lists.postgresql.org; Tue, 10 Mar 2026 18:14:14 +0000 Received: by mail-lf1-x134.google.com with SMTP id 2adb3069b0e04-5a107b387a5so8481218e87.2 for ; Tue, 10 Mar 2026 11:14:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1773166451; cv=none; d=google.com; s=arc-20240605; b=DP1F8fMoBZRfzlYWE8NV1BnbMIrnD8iY4SW2/gv5RGJreUzWxpRyUqbci4Umz7pVY1 JfXsGfNSNN7ZTSZGcs3Y/uHa2MJif/czFhEkpFynigg51lUPJFA3Qu59S4gj4kh1yv1v geDfEqncE0M5jZ3OBK0mJBBIPFNUdQiG+IAZklsH7fCZsa/8/tuXUfN5Zqggidlq31+Y Qt/VZLciFbzeO4lGXupVWvXOISYQ5S7RCyy2XMU3ZIPfMdElxZUuJiAc13l9P7JDpLp3 /YcA59KFTP0m4x+9459UkNtHYjakv3KIyOLzXb0lB9uuLAwKzaLSlxrwpePHtRlooNOV KtkQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=iAn9Imw7y4XEXytR2MsgWdbAOpbV5nwRJWVosnZ9nk8=; fh=67Azv/Dhz0sfK5dJ7drNrTp7PJKhaknlkVjPCexLbUM=; b=kY9oAhOX68z3t/Cdbu4vjI08VjkzNSHLHe+blSVBO8S0g7dYhtU2U2D371TsSeMrkE w16RRoJm4yieBLx0VM+d5uTn4aQJHJeSqcGBOM+3xiB6TcfcjibGE8WkRD4iZYESBe1Y k7S1a1HbtsHQHEkxnX6LFlfufpd/7uul2sEWJc4SgyNjB2UWjDPd/vobsx/XyuTW9iI6 g+8itS4nSgA+IEyPv670VL5U0qGxbBSi0lZlO1pyVsuqJMpD4QL2eUXRt9Ixo0a0lfMT YR+vZgGlqdR89B05iZgsBvRfPyxzHA5UKU+ogVgQtnptUaG6kUyvciYmD2+TtcBqxkX5 dLRA==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773166451; x=1773771251; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=iAn9Imw7y4XEXytR2MsgWdbAOpbV5nwRJWVosnZ9nk8=; b=jI4Tjn+//uyG7TTXebr8PDRKd6/v0DXf1IMspyKzeAagGNO5nIyJOkIfPPBSv+WUl5 1V0pYewJzsD2odmtr45B/O4X/hD7VLTQFxQuHSqZ6C9u/R+z+JUa7tJ6iqwfBozB1AsC nTJbXO5NWT2I16PPRBGq6Pj0jLK3B+3B+3vE/0nODKmHjI4JKRvzbQ72EkeTAeK/ZaqN 5vXvQWXsrGbTlljVb2Kst1482unqg0GKu1Su/UKp8yP0k4gnTMtvCkQk5RJucBK5tUoQ mn0+afEVolab4FKGrC4vgZUjSHLo8R5uI7+k4xJz1bEZhKMe5BZcG5suf/WD4U5jx2Uh c5ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1773166451; x=1773771251; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=iAn9Imw7y4XEXytR2MsgWdbAOpbV5nwRJWVosnZ9nk8=; b=wW0wFCP/F2DmLYlnUhshUxgxPqxhLl6VQtAHoe+OhDCi5mF7xbeCEz87NuSspAsLuB SXBRLr3O/iIzgpCwVAFGhkDJk8puIYAiB05Xx8C+I59oMzaI5sbjew5WN5Siz7ABAG2i +axu7C0ixhDvGMgcY/MhhlILWCQa3YvWksddTkksNAN6HVHIDGRAubuC3VNVpZkt3UE+ fCXcfZHeVCv7ULSmMI860S6dKks8/CiEJ5EwetcUc+v4F0m1DnZsaGtpx4JyfuIlt53i LeWs5PPP1h+2/Jd56PlpeIGD3vlBY7m2b2wlwbTQVxDuF8KK07sN2+gieNVSHvi+T7mx Mh6A== X-Forwarded-Encrypted: i=1; AJvYcCUTgfNL+OFPPh55k6qEx46zK7P+mrvjSJGcyD0rwGRbCIaQovcLgB7vMi40gvlrrTfmFftaCVEKdwT57GZF@lists.postgresql.org X-Gm-Message-State: AOJu0YztCPSlaO8rQ18NiZqAoeRB1uvtVhpaNG8GxvIMZVxk2VnRhduu czi8r5YQ/HbfRSjfai2URd5hbqyOAdX3rKMRKL4XKCi+7eHEZQ0AMNQKqVTZfjGK5hYSArvb9GV wwhO7ksUqa1Wh77hEmQDZwJy0c97V6rfZeQ== X-Gm-Gg: ATEYQzyuFFjVhj1ty6eLzuB6IhidPM/PkDyDGL/3MyjtJiNoeBlzoLKXKRID6vpx3p6 /Hn+5EBPAkkFHF9h+XwjEduRmMuP0CJDqkQRK1a9gl02Og2s70OioDLscjnrzZUoH8tSAj2Olz6 RrxBBGkw3WzHjiRFso0MlIpAMySKTtmbGBfgYbV/ojRXkoPKGGUN2sjR5M78LKgDr8asY/Eff7H cEuq4KmPvHg2+51rMMCzGp0I0wKLOyVwva9q0x0y8WF921Z+A0Ifhd6k4rZPcNcnWAq6NbyyUeS tgtwqVF8 X-Received: by 2002:a05:6512:10d5:b0:5a1:18a0:8552 with SMTP id 2adb3069b0e04-5a13caadcbfmr6656732e87.9.1773166450908; Tue, 10 Mar 2026 11:14:10 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Masahiko Sawada Date: Tue, 10 Mar 2026 11:13:34 -0700 X-Gm-Features: AaiRm500Zd_GgUMSdmGKPDGZiJnhgDa7i4_kzkyY_xqeqXE-7Z3k12qsl1A5zTo Message-ID: Subject: Re: POC: Parallel processing of indexes in autovacuum To: Daniil Davydov <3danissimo@gmail.com> Cc: Sami Imseih , Alexander Korotkov , Matheus Alcantara , Maxim Orlov , Postgres hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Mar 3, 2026 at 10:59=E2=80=AFPM Daniil Davydov <3danissimo@gmail.co= m> wrote: > > Hi, > > On Tue, Mar 3, 2026 at 5:26=E2=80=AFAM Masahiko Sawada wrote: > > > > On Sun, Mar 1, 2026 at 6:46=E2=80=AFAM Daniil Davydov <3danissimo@gmail= .com> wrote: > > > > > > Thus, a/v leader cannot launch any workers if max_parallel_workers is= set to 0. > > > > Right. But this fact would actually support that limiting > > autovacuum_max_parallel_workers by max_parallel_workers is more > > appropriate, no? > > > > av_max_parallel_workers is really limited by max_parallel_workers only > during shmem init. After that we can change it to a value that is higher > than max_parallel_workers, and nothing bad will happen (obviously). > > So, my point was : why should we have this explicit limitation if it > 1) doesn't guard us from something bad and 2) can be violated at any time > (via ALTER SYSTEM SET ...). > > Now it seems to me that limiting our parameter by max_parallel_workers is > more about grouping of logically related parameters, not a practical nece= ssity. I believe there is also a benefit for users when they want to disable all parallel behavior. If av_max_parallel_workers is in max_parallel_worker group, they would have to set just max_parallel_workers to 0. Otherwise, they would have to set both max_parallel_workers and av_max_parallel_workers. > > > > I suppose to do the same as we did for try/catch block - add logging = inside > > > the "autovacuum_worker_before_shmem_exit" with some unique message. > > > Thus, we will be sure that the workers are released precisely in the > > > "before_shmem_exit_hook". > > > > > > The alternative is to pass some additional information to the > > > "ReleaseAllParallelWorkers" function (to supplement the log it emits)= , but it > > > doesn't seem like a good solution to me. > > > > I'm not sure if it's important to check how > > AutoVacuumReleaseAllParallelWorkers() has been called (either in > > PG_CATCH() block or by autovacuum_worker_before_shmem_exit()). We > > would end up having to add a unique message to each caller of > > AutoVacuumReleaseAllParallelWorkers() in the future. I guess it's more > > important to make sure that all workers have been released in the end. > > > > In that sense, it would make more sense to check that all workers have > > actually been released (i.e., checking by > > get_parallel_autovacuum_free_workers()) after a parallel vacuum > > instead of checking workers being released by debug logs. That is, we > > can check at each test end if get_parallel_autovacuum_free_workers() > > returns the expected number after disabling parallel autovacuum. > > > > Sure, at first we want to check whether all workers have been > released. But the ability to release them precisely in the try/catch > block is also important, because if it doesn't - a/v worker can "hold" > these workers until it finishes vacuuming of other tables (which can > take a lot of time). Such a situation will surely degrade performance, > so I think that we must check whether we can release workers precisely > during ERROR handling. Do you agree with it? I agree that we need to make sure that parallel workers are released even during ERROR handling, but I don't think it's important to check the places where AutoVacuumReleaesAllParallelWorkers() is called, by using regression tests. It's more important and future-proof that we check if all workers are released according to the shmem data. In other words, even if we call AutoVacuumReleaseAllParallelWorkers() in an unexpected call path in an ERROR case, it's still okay if we successfully release all workers in the end. These regression tests should test these database behavior but not what specific code path taken. If we can check if all workers are released by checking the shmem, why do we need to check further where they are released? > > I understand your concerns about adding a unique log message for each > ReleaseAll call. But I cannot imagine a new situation when we need to > emergency release workers. If you think that it might be possible, I can > propose adding a new optional parameter to the "ReleaseAll" function - > something like "char *context_msg", which will be added to the elog place= d > inside this function. I think we should not make the function complex just for testing purposes. My point is that what we should be testing is the behavior -- specifically whether parallel workers are released at the expected timing -- rather than focusing on whether a specific code path was executed. > > > On second thoughts on the "planned" and "reserved", can we consider > > what the patch implemented as "reserved" as the "planned" in > > autovacuum cases? That is, in autovacuum cases, the "planned" number > > considers the number of parallel degrees based on the number of > > indexes (or autovacuum_parallel_workers value) as well as the number > > of workers that have actually been reserved. In cases of > > autovacuum_max_parallel_workers shortage, users would notice by seeing > > logs that enough workers are not planned in the first place against > > the number of indexes on the table. That might be less confusing for > > users rather than introducing a new "reserved" concept in the vacuum > > logs. Also, it slightly helps simplify the codes. > > Yeah, it sounds tempting. But in this case we're shifting more responsibi= lity > to the user. For instance : > If av_max_workers =3D 5 and there are two a/v leaders each of which is tr= ying > to launch 3 parallel workers, we will see logs like "3 planned, 3 launche= d", > "2 planned, 2 launched". IMHO, such a log doesn't imply that there is a > shortage of workers. I.e. this is the user's responsibility to notice tha= t the > second a/v leader could launch more than 2 workers for processing of the > table with (N + 2) indexes. > In this case even our previous version of logging will give more informat= ion > to the user : "3 planned, 3 launched", "3 planned, 2 launched". > > If we don't want to create a new "reserved" concept, maybe we can rename > it to something more intuitive? For example, "n_abandoned" - number of > workers that we were unable to launch due to av_max_parallel_workers > shortage. If n_abandoned is 0 and n_launched < n_planned, the user can > conclude that he should increase the max_parallel_workers parameter. > And vica versa, if n_launched =3D=3D n_planned and n_abandoned > 0, the > user can conclude that he should increase the > autovacuum_max_parallel_workers parameter. > > What do you think? While I agree that showing only two numbers might lack some information for users, I guess the same is true for max_parallel_maintenance_workers or other parallel queries related to GUC parameters. For instance, suppose we set max_parallel_maintenance_workers to 2, if the table has (large enough) 4 indexes, we would plan to execute a parallel vacuum with 2 workers instead of 4 due to max_parallel_maintenance_worker shortage and it's even possible that only 1 worker can launch due to max_worker_processes shortage. In this case, we currently consider that 2 workers are planned. Isn't it the same situation as the case where we reserved 2 parallel vacuum workers for autovacuum for the table with 4 indexes? > > **Comments on the 0001 patch** > > > * of the worker list (see above). > > @@ -299,6 +308,8 @@ typedef struct > > WorkerInfo av_startingWorker; > > AutoVacuumWorkItem av_workItems[NUM_WORKITEMS]; > > pg_atomic_uint32 av_nworkersForBalance; > > + uint32 av_freeParallelWorkers; > > + uint32 av_maxParallelWorkers; > > } AutoVacuumShmemStruct; > > > > We should use int32 instead of uint32. > > I don't mind, but I don't quite understand the reason. We assume that the > minimal value for both variables is 0. Why shouldn't we use unsigned > data type? Unsigned integers should be used for bit masks, flags, or when we need to handle more than INT_MAX. Signed integers are preferable in other cases as we're using signed integers for controlling the number of workers and autovacuum_max_parallel_workers is defined as signed int (which could be stored to AutoVacuumShmem->av_maxParallelWorkers). Here are some review comments. * 0001 patch: + /* Cannot release more workers than reserved */ + Assert(nworkers <=3D av_nworkers_reserved); I think it's better to use Min() to cap the number of workers to be released by av_nworkers_reserved as Assert() won't work in release builds. * 0004 patch: Can we write the same test cases while not relying on the 0002 patch (i.e., worker usage logging)? We check the worker usage log at two places in the regression tests. The idea is that we write the number of workers planned, reserved, and launched in DEBUG log level and check these logs in the regression tests. The patch 0001, 0003, and 0004 can be merged before push while we might want more discussion on the 0002 patch. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com