Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vw9bH-0061jb-2o for pgsql-hackers@arkaria.postgresql.org; Sat, 28 Feb 2026 01:58:36 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vw9aF-008sdy-1H for pgsql-hackers@arkaria.postgresql.org; Sat, 28 Feb 2026 01:57:31 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vw9aE-008sdU-2V for pgsql-hackers@lists.postgresql.org; Sat, 28 Feb 2026 01:57:31 +0000 Received: from mail-lj1-x22b.google.com ([2a00:1450:4864:20::22b]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vw9aB-00000001bM0-0VQV for pgsql-hackers@lists.postgresql.org; Sat, 28 Feb 2026 01:57:29 +0000 Received: by mail-lj1-x22b.google.com with SMTP id 38308e7fff4ca-389fad34e2eso33385551fa.3 for ; Fri, 27 Feb 2026 17:57:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1772243845; cv=none; d=google.com; s=arc-20240605; b=gIoHcGMvYcp8e/cO/orWBZTKdaZFIOBFfSBGTpFaJHQr55Lv8CN2PtDn8XA/MX3K6T vkKaL1Ix0NjSME5rkekvsiK0xygtb3K4GXPKqXN/QGn8n5Y+99luAZiK1EgcpuHke5iJ JK/bd4RHsopdAQuk1s9sxpkIOQ+8MVegG1427sz7sGopxF0lJpTSNDzc9wIueZ7GkuMI KGBc3h2iVeBAZjl/ARviCPR9+/xb272gLM/72WubA80+h0XwlvwiFOqvPK5guvlIgqEy 8AOVHW6YQPurjzSz7J+/1SzNcT3XEsFMCRk7rzbazIfr95QRP4h0//c4I8hy3mbcfdd2 wRDg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=KszxNvM98PPhBbBFV3STCfSu4IC1+lhXHFAqfl7BRbw=; fh=YSpBmZCyNiI2cAQA1cynJgpEqRHI8Z8Utau3q4CmAZs=; b=PBCii1OCWxWw8/A3FfPyMriI248i1bzZtR1+xT1g8IgfwDp4xcXXCcG113/4P+pRBO YyZIxF72foKUReZTkwRF3h1ueycs4vmHzVmCxdmk3MaChLvWR/8ZP9QE6RvL+wWVZmcQ Owp7W0LCpinWe2r0Vd0GleqiuQ9WliHlFXFr1GVEYuZZDfM3bSWt0jCm54a4d74ON5IS Unr5k8lqALDpuUQJ20glaldweDvXj1kjRW/l5mdWGxCmNsyqsQq7rLp5mLQxPlX5t9Ck 3vmMopocgo3HumCBETupJqhP3NDAFI8YrkSfRflK6bHBQ/l7rMoXvxcCXU2FWhNhy2MU 5ESw==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1772243845; x=1772848645; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=KszxNvM98PPhBbBFV3STCfSu4IC1+lhXHFAqfl7BRbw=; b=X34lkayI45649fW3P/y79OK2S2+BCx80fThFK4zBCm0M1bnXZzw3i99kCQHWqLhBy0 qSb9ZAHQ42IAq3pwNIM6q0JnPvF1HYd/mMjOAC9wz7l43CtM8fQbfYJpKXQLz9dBWfe4 FTp7Hmrs0EY3R7AMHeiGMnqkX15Jiq6ruMsKPlMYwh4MiaoYZa/mIjfwtdduLsPivemF v3U1izsyrZpSz/NPU6GC6kyAzKoiZ2xlYksPOPyNXVZklxEnAXKTfC7/n0Lq4cWdS5ks m+v764BPESYlytQk/B00sqUL2E6Zb1PWkr2zbMuTQBqO7JB1n8GBlRz87/0K+k62DpKl 5rFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1772243845; x=1772848645; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=KszxNvM98PPhBbBFV3STCfSu4IC1+lhXHFAqfl7BRbw=; b=u+lWN/n5HQQ+J+D/04Y5PVfXUOf/mdi1RC48oRfdN7jobo+6hl/tGoJn+sPBE2MvLO Qg5wOneMujglkqVxmZNeyPUVZJ+5u47NwmgBj1wMhScKLWnG1UnHGZkLehjl9RwvYgSw b8bGowOVasujx6KyzOEVgE275W3rrhzRjxgqsQDx5k5RY19dYOoa3zHeIwencSdSkzWi OYJ8d1lEoV1cxN+UN59coVEi760Q/1S9SL5LQWxh72ysxqfp6qnbF8XCdYeMtz18Jrkw iFR4JXTDOcDr/iMHaHkSwU3r5dE26Glf0W0ZqPEEg7PMMsne4yZFOqVQoPxQD5yJCSnE 9H7A== X-Forwarded-Encrypted: i=1; AJvYcCUTK051RB8453V8JH8f0Y4UwvcrdUNgXxy6dRlBhITp0qaDSkUEKovFApDRE9qSA7N6FRJqh4blnS/sEyer@lists.postgresql.org X-Gm-Message-State: AOJu0YwpFX8MAelwt0rNhFlkM44RGHnvJ578XUVPsf1CvnayopR5+KTj Wp51zJIZTmdutWi3uTC2JQcx/cNvKRNFHGONtB3f7RUOjy1Ojac/VoaskNaAI+MzzKHJy+SuTTT Jhw1xE0HxQW7o1OuS4KjZnmJMQ0kb/xM= X-Gm-Gg: ATEYQzwxDh1u/7KbOHlnhnLrDYeANn58yN/A0LsHEGY+MqqcsSGJqKmAjGikj7Wudol eYV1fxnyOCf9mAu6xE/HLWV72h2hicNe6i+XJ9Z0RiPC83iPMW20sXRGZHR5UrAwVDdn1K8r3Uq Yn7shfXw8//VQDNxA5YFc2NIvSeL7CWnBerFQRdWk3N+6PoMbzK9itZGzvGMvjAqi3wO0oCxzko oRP0VAdGXmkT7l2dYYcxfhMmldIl+PQj50Ak0SIWO0mJQUdJbtdVvvIAXwyHAE/rwea05jCYcQ5 qeRL6cZN X-Received: by 2002:a2e:8692:0:b0:381:800:910d with SMTP id 38308e7fff4ca-389ff35783cmr23607771fa.38.1772243844913; Fri, 27 Feb 2026 17:57:24 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Masahiko Sawada Date: Fri, 27 Feb 2026 17:56:48 -0800 X-Gm-Features: AaiRm52RQAfdLnzDSjM4rZ-fxyuJAxZX1USYFNQpGA4BBVmSwVOOMzJsAbWuAE8 Message-ID: Subject: Re: POC: Parallel processing of indexes in autovacuum To: Daniil Davydov <3danissimo@gmail.com> Cc: Sami Imseih , Alexander Korotkov , Matheus Alcantara , Maxim Orlov , Postgres hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Feb 27, 2026 at 5:49=E2=80=AFAM Daniil Davydov <3danissimo@gmail.co= m> wrote: > > Hi, > > On Thu, Feb 26, 2026 at 6:59=E2=80=AFAM Masahiko Sawada wrote: > > > > For example, if users want to disable all parallel queries, they can do > > that by setting max_parallel_workers to 0. If parallel vacuum workers > > for autovacuums are taken from max_worker_processes pool (i.e., > > without max_paralle_workers limit), users would need to set both > > max_parallel_workers and autovacuum_max_parallel_workers to 0. > > > > This is kinda off-topic already, but I really want to clarify this questi= on. > > If parallel a/v workers are not limited by max_parallel_workers and the > user wants to disable all parallel operations, it is still enough to set > max_parallel_workers to 0. In this case parallel a/v could not acquire an= y > workers from bgworkers pool, and thus the user's goal is reached (and the= re > is no need to set autovacuum_max_parallel_workers to 0). IIUC earlier patches defined autovacuum_max_parallel_workers with the limit by max_worker_processes. Suppose we set: - max_worker_processes =3D 8 - autovacuum_max_parallel_workers =3D 4 - max_parallel_workers =3D 4 If we want to disable all parallel operations, we would need to set max_parallel_workers to 0 as well as either autovacuum_max_parallel_workers to 0, no? This is because if we set only max_parallel_workers to 0, autovacuum workers still can take parallel vacuum workers from the max_worker_processes pool. I might be missing something though. > > **Comments on the 0003 patch** > > > > > +typedef struct CostParamsData > > +{ > > + double cost_delay; > > + int cost_limit; > > + int cost_page_dirty; > > + int cost_page_hit; > > + int cost_page_miss; > > +} CostParamsData; > > > > The name CostParamsData sounds too generic and I guess it could > > conflict with optimizer-related struct names in the future. How about > > renaming it to VacuumDelayParams? > > I agree with the idea to rename this structure. But maybe we should renam= e > it to "VacuumCostParams"? This name conveys the contents of the structure > better, because enabling these parameters is called "VacuumCostActive". +1 > > > + SpinLockAcquire(&pv_shared_cost_params->mutex); > > + > > + shared_params_data =3D pv_shared_cost_params->params_data; > > + > > + VacuumCostDelay =3D shared_params_data.cost_delay; > > + VacuumCostLimit =3D shared_params_data.cost_limit; > > + VacuumCostPageDirty =3D shared_params_data.cost_page_dirty; > > + VacuumCostPageHit =3D shared_params_data.cost_page_hit; > > + VacuumCostPageMiss =3D shared_params_data.cost_page_miss; > > + > > + SpinLockRelease(&pv_shared_cost_params->mutex); > > > > If we copy the shared values in pv_shared_cost_params, we should > > release the spinlock earlier, i.e., before updating VacuumCostXXX > > variables. But I don't think we would even need to set these values in > > the local variables in this case as updating 4 local variables is > > fairly cheap. > > > > Do you mean that we can release spinlock because we already copied the va= lues > from the shared state to the local variable "shared_params_data"? Yes. > I added this > variable as an alias for the long string "pv_shared_cost_params->params_d= ata" > and I guess that compiler will get rid of it. > > But now it doesn't seem like a good solution to me anymore. I'll get rid = of > the local variable and copy the values directly from the shared state > (under spinlock). Thanks. > > > > > How about renaming it to use_shared_delay_params? I think it convey= s > > > > better what the field is used for. > > > > > > I think that we should leave this name, because in the future some ot= her > > > behavior differences may occur between manual VACUUM and autovacuum. > > > If so, we will already have an "am_autovacuum" field which we can use= in > > > the code. > > > The existing logic with the "am_autovacuum" name is also LGTM - we sh= ould > > > use shared delay params only because we are running parallel autovacu= um. > > > > It may occur but we can change the field name when it really comes. > > > > I'm slightly concerned that we've been using am_xxx variables in a > > different way. For instance, am_walsender is a global variable that is > > set to true only in wal sender processes. Also we have a bunch of > > AmXXProcess() macros that checks the global variable MyBackendType, to > > check the kinds of the current process. That is, the subject of 'am' > > is typically the process, I guess. On the other hand, > > am_parallel_autovacuum is stored in DSM space and indicates whether a > > parallel vacuum is invoked by manual VACUUM or autovacuum. > > Yeah, I agree that "am_xxx" is not the best choice. > What about a simple "bool is_autovacuum"? +1 > > **Comments on the 0004 patch** > > > If we write the log "%d parallel autovacuum workers have been > > released" in AutoVacuumReleaseParallelWorkres(), can we simplify both > > tests (4 and 5) further? > > > > It won't help the 4th test, because ReleaseParallelWorkers is called > due to both ERROR and shmem_exit, but we want to be sure that > workers are released in the try/catch block (i.e. before the shmem_exit). We already call AutoVacuumReleaseAllParallelWorker() in the PG_CATCH() block in do_autovacuum(). If we write the log in AutoVacuumReleaseParallelWorkers(), the tap test is able to check the log, no? > Also, I don't know whether the 5th test needs this log at all, because in > the end we are checking the number of free parallel workers. If a killed > a/v leader doesn't release parallel workers, we'll notice it. If we can check the log written at process shutdown time, I think we can somewhat simplify the test 5 logic by not attaching 'autovacuum-start-parallel-vacuum' injection point. 1. attach 'autovacuum-leader-before-indexes-processing' injection point. 2. wait for an av worker to stop at the injection point. 3. terminate the av worker. 4. verify from the log if the workers have been released. 5. disable parallel autovacuum. 6. check the free workers (should be 10). Step 5 and 6 seems to be optional though. > > > + ereport(DEBUG2, errmsg("%s", buf.data)); > > > > Let's use elog() instead of ereport(). > > > > I suppose this is suggested because we don't want to translate error > messages of DEBUG level. Did I understand you correctly? We use ereport() for DEBUG level messages in many places actually. I suggested it because this message is not a user-facing message. > Please, see updated set of patches and diffs between v21 and v22. Thank you for updating the patches! Here are review comments on the v22 patch set. * 0001 patch: + /* + * Max number of parallel autovacuum workers. If value is 0 then parall= el + * degree will computed based on number of indexes. + */ + int autovacuum_parallel_workers; I'm a bit concerned that the above description doesn't explain what number of parallel vacuum workers are used in >0 as it mentioned only the maximum number. How about rewording it to: Target number of parallel autovacuum workers. -1 by default disables parallel vacuum during autovacuum. 0 means choose the parallel degree based on the number of indexes. * 0002 patch: + PVWorkersUsage workers_usage; /* Counters that follow are only for scanned_pages */ int64 tuples_deleted; /* # deleted from table */ Let's insert a new line between the new line and the existing line. --- + + if (AmAutoVacuumWorkerProcess()) + { + /* Worker usage stats for parallel autovacuum. */ + appendStringInfo(&buf, + _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"), + vacrel->workers_usage.vacuum.nplanned, + vacrel->workers_usage.vacuum.nreserved= , + vacrel->workers_usage.vacuum.nlaunched= ); + } + else + { + /* Worker usage stats for manual VACUUM (PARALLEL). */ + appendStringInfo(&buf, + _("parallel workers: index vacuum: %d planned, %d launched in total\n"), + vacrel->workers_usage.vacuum.nplanned, + vacrel->workers_usage.vacuum.nlaunched= ); + } + } These comments are very obvious so I don't think we need them. Instead, I think it would be good to explain why we don't need to report "reserved" numbers in the manual vacuum cases. --- + if (vacrel->workers_usage.vacuum.nplanned > 0) + { + /* Stats for vacuum phase of index vacuuming. */ and + if (vacrel->workers_usage.cleanup.nplanned > 0) + { + /* Stats for cleanup phase of index vacuuming. */ + I don't think we need these comments (the second one has a typo though) as it's obvious. --- */ void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples, - int num_index_scans) + int num_index_scans, PVWorkersUsage *wu= sage) Please add a brief description of wusage to the function comment. We can add comments to both parallel_vacuum_bulkldel_all_indexes() and parallel_vacuum_cleanup_all_indexes() or only parallel_vacuum_process_all_indexes(). --- @@ -2070,6 +2070,8 @@ PVIndStats PVIndVacStatus PVOID PVShared +PVWorkersUsage +PVWorkersStats PX_Alias PX_Cipher PX_Combo @@ -2408,6 +2410,7 @@ PullFilterOps PushFilter PushFilterOps PushFunction +PVWorkersUsage PyCFunction PyMethodDef PyModuleDef PVWorkersUsage is added twice * 0003 patch: +#define VacCostParamsEquals(params) \ + (vacuum_cost_delay =3D=3D (params).cost_delay && \ + vacuum_cost_limit =3D=3D (params).cost_limit && \ + VacuumCostPageDirty =3D=3D (params).cost_page_dirty && \ + VacuumCostPageHit =3D=3D (params).cost_page_hit && \ + VacuumCostPageMiss =3D=3D (params).cost_page_miss) I'm not sure this macro helps reduce lines of code or improve readability as it's used only once and it's slightly unnatural to me that *Equals macro takes only one argument. * 0004 patch: +#include "commands/vacuum.h" +#include "fmgr.h" +#include "miscadmin.h" +#include "postmaster/autovacuum.h" +#include "storage/shmem.h" +#include "storage/ipc.h" +#include "storage/lwlock.h" +#include "utils/builtins.h" +#include "utils/injection_point.h" We can remove some unnecessary header includes. ISTM we need only fmgr.h, autovacuum.h, and injection_point.h. --- + const char *msg_format =3D + _("Parallel autovacuum worker cost params: cost_limit=3D%d, cost_delay=3D%g, cost_page_miss=3D%d, cost_page_dirty=3D%d, cost_page_hit=3D%d"); + I don't think we need the translation for this message as it's not a user-facing one. We don't capitalize the first letter in the error message. --- + ereport(DEBUG2, + (errmsg("number of free parallel autovacuum workers is set to %u due to config reload", + AutoVacuumShmem->av_freeParallelWorkers), + errhidecontext(true))); Why do we need to add errhidecontext(true) here? --- + 'tests': [ + 't/001_basic.pl', + ], Need to be updated to the new filename. --- + * Copyright (c) 2020-2025, PostgreSQL Global Development Group Please update the copyright years. Regards, --=20 Masahiko Sawada Amazon Web Services: https://aws.amazon.com