From: Masahiko Sawada
Date: Wed, 11 Mar 2026 12:05:16 -0700
Subject: Re: POC: Parallel processing of indexes in autovacuum
To: Daniil Davydov <3danissimo@gmail.com>
Cc: Sami Imseih, Alexander Korotkov, Matheus Alcantara, Maxim Orlov, Postgres hackers

On Wed, Mar 11, 2026 at 4:28 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >
> > While I agree that showing only two numbers might lack some
> > information for users, I guess the same is true for
> > max_parallel_maintenance_workers or other parallel queries related to
> > GUC parameters. For instance, suppose we set
> > max_parallel_maintenance_workers to 2, if the table has (large enough)
> > 4 indexes, we would plan to execute a parallel vacuum with 2 workers
> > instead of 4 due to max_parallel_maintenance_workers shortage and it's
> > even possible that only 1 worker can launch due to
> > max_worker_processes shortage. In this case, we currently consider
> > that 2 workers are planned.
> > Isn't it the same situation as the case
> > where we reserved 2 parallel vacuum workers for autovacuum for the
> > table with 4 indexes?
>
> I don't think that examples with other "max_parallel_" parameters will be
> appropriate, because these parameters are limiting the number of parallel
> workers for *single* operation/executor node/... . At the same time,
> av_max_parallel_workers limits the total number of parallel workers across
> all a/v leaders.
>
> Regarding the situation that you provided :
> The number of planned workers is reduced inside the
> parallel_vacuum_compute_workers due to the max_parallel_maintenance_workers
> limit. I.e. we cannot plan more workers than required by the config, and
> it's completely OK. No one expects the number of "planned workers" to be more
> than max_parallel_maintenance_workers.
>
> IMO there is no need to make efforts to track the shortage of
> max_parallel_maintenance_workers for the VACUUM (PARALLEL), because this
> parameter just plays the role of a limiter. We will consider only the
> shortage of max_parallel_workers, that can be determined by looking at
> "planned vs. launched".
>
> And here is a difference with a parallel autovacuum :
> av_max_parallel_workers is considered twice : in the
> "parallel_vacuum_compute_workers" and "ReserveWorkers" functions.
> So the low number of launched workers can be explained by the shortage of
> both av_max_parallel_workers and max_parallel_workers. Since we want to
> distinguish between these cases, we have added the "nreserved" concept.
>
> I see that few modules can report something like "out of background worker
> slots" when they cannot launch more workers due to max_parallel_workers
> shortage (but modules depending on the "parallel.c" logic don't do so).
> This fact gave me another idea :
> If we don't want to log "nreserved" or some other similar value, maybe
> we should add logging after the "ReserveWorkers" function? I.e.
> if some workers cannot be reserved, we can emit a log like "out of parallel
> autovacuum workers. you should increase the av_max_parallel_workers
> parameter". Having this log can help the user distinguish between
> max_parallel_workers/av_max_parallel_workers shortage situations.
> What do you think?

My point is that the process of determining the number of workers
planned to launch is somewhat unclear to users in both cases. We
consider not only GUCs such as max_parallel_maintenance_workers but
also index AM definitions (i.e., amparallelvacuumoptions), index
sizes, etc. But I agree that providing more detailed logs might help
users understand and notice the av_max_parallel_workers shortage.

BTW this discussion made me think of changing av_max_parallel_workers
to control the number of workers per autovacuum worker instead
(renaming it to, say, max_parallel_workers_per_autovacuum_worker).
Users can compute the maximum number of parallel workers the system
requires as (autovacuum_worker_slots *
max_parallel_workers_per_autovacuum_worker). We would no longer need
the reservation and release logic. I'd like to hear your opinion.

>
> Summary :
> 1)
> I think that we should not look at maintenance vacuum while
> considering how to inform the user about parameters shortage for autovacuum,
> because we have a more complicated situation in case of autovacuum.
> 2)
> I suggest adding a separate log that will be emitted every time we are
> unable to start workers due to a shortage of av_max_parallel_workers.

For (2), do you mean that the worker writes these logs regardless of
the log_autovacuum_min_duration setting? I'm concerned that the server
logs would be flooded with these messages, especially when multiple
autovacuum workers are working very actively and the system is facing
a shortage of av_max_parallel_workers.

>
> > * 0004 patch:
> >
> > Can we write the same test cases while not relying on the 0002 patch
> > (i.e., worker usage logging)?
> > We check the worker usage log at two
> > places in the regression tests. The idea is that we write the number
> > of workers planned, reserved, and launched in DEBUG log level and
> > check these logs in the regression tests. The patch 0001, 0003, and
> > 0004 can be merged before push while we might want more discussion on
> > the 0002 patch.
>
> Possibly we can introduce a new injection point, or a new log for it.
> But I assume that the subject of discussion in patch 0002 is the
> "nreserved" logic, and "nlaunched/nplanned" logic does not raise any
> questions.
>
> I suggest splitting the 0002 patch into two parts : 1) basic logic and
> 2) additional logic with nreserved or something else. The second part can be
> discussed in isolation from the patch set. If we do this, we may not have to
> change the tests. What do you think?

Assuming the basic logic means nlaunched/nplanned logic, yes, it would
be a nice idea. I think user-facing logging stuff can be developed as
an improvement independent from the main parallel autovacuum patch.
It's ideal if we can implement the main patch (with tests) without
relying on the user-facing logging.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com