public inbox for [email protected]
Re: POC: Parallel processing of indexes in autovacuum
112+ messages / 8 participants
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-01 01:02 Masahiko Sawada <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-01 01:02 UTC (permalink / raw)
To: Maxim Orlov <[email protected]>; +Cc: Postgres hackers <[email protected]>
Hi,
On Wed, Apr 16, 2025 at 4:05 AM Maxim Orlov <[email protected]> wrote:
>
> Hi!
>
> The VACUUM command can be executed with the parallel option. As the documentation states, it will perform the index vacuum and index cleanup phases of VACUUM in parallel using integer background workers. But this interesting feature is not used by autovacuum. After a quick look at the source code, it became clear to me that when the parallel option was added, the corresponding option for autovacuum wasn't implemented, although there are no clear obstacles to this.
>
> Actually, one of our customers ran into a problem with autovacuum on a table with many indexes and relatively long transactions. Of course, long transactions are the ultimate evil, and the problem can be solved by running vacuum from a cron task, but I think we can do better.
>
> Anyhow, what about adding a parallel option for autovacuum? Here is a POC patch for the proposed functionality. For the sake of simplicity, several GUCs have been added. It would be good to think through the parallel launch condition without them.
>
> As always, any thoughts and opinions are very welcome!
As I understand it, we initially disabled parallel vacuum for
autovacuum because their objectives are somewhat contradictory.
Parallel vacuum aims to accelerate the process by utilizing additional
resources, while autovacuum is designed to perform cleaning operations
with minimal impact on foreground transaction processing (e.g.,
through vacuum delay).
Nevertheless, I see your point about the potential benefits of using
parallel vacuum within autovacuum in specific scenarios. The crucial
consideration is determining appropriate criteria for triggering
parallel vacuum in autovacuum. Given that we currently support only
parallel index processing, suitable candidates might be autovacuum
operations on large tables that have a substantial number of
sufficiently large indexes and a high volume of garbage tuples.
Once we have parallel heap vacuum, as discussed in thread[1], it would
also likely be beneficial to incorporate it into autovacuum during
aggressive vacuum or failsafe mode.
Although the actual number of parallel workers ultimately depends on
the number of eligible indexes, it might be beneficial to introduce a
storage parameter, say parallel_vacuum_workers, that allows control
over the number of parallel vacuum workers on a per-table basis.
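A minimal sketch of how such a per-table setting could be resolved into a worker count. It assumes a hypothetical "not set" sentinel and a hypothetical instance-wide cap; the function name and the fallback rule are illustrative only and do not come from the patch.

```c
/* Hypothetical sentinel: the table does not set parallel_vacuum_workers. */
#define PARALLEL_VACUUM_WORKERS_UNSET (-1)

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Resolve the number of parallel vacuum workers for one table.
 *
 * table_setting    - per-table parallel_vacuum_workers, or the UNSET sentinel
 * global_cap       - instance-wide upper limit (assumed to exist)
 * eligible_indexes - number of indexes that can be processed in parallel
 *
 * Returns 0 when parallel processing should not be used.
 */
int
resolve_parallel_vacuum_workers(int table_setting, int global_cap,
								int eligible_indexes)
{
	int			requested;

	if (global_cap <= 0 || eligible_indexes <= 0)
		return 0;

	/* Fall back to the global cap when the table has no explicit setting. */
	requested = (table_setting == PARALLEL_VACUUM_WORKERS_UNSET) ?
		global_cap : table_setting;

	/* An explicit 0 disables parallel processing for this table. */
	if (requested <= 0)
		return 0;

	/* Never exceed the global cap or the number of eligible indexes. */
	return min_int(min_int(requested, global_cap), eligible_indexes);
}
```

In this sketch the per-table value can only lower the worker count, never raise it above the global cap, so one table cannot claim more than the instance allows.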
Regarding implementation: I notice the WIP patch implements its own
parallel vacuum mechanism for autovacuum. Have you considered simply
setting at_params.nworkers to a value greater than zero?
Regards,
[1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.g...
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-02 16:58 Sami Imseih <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 2 replies; 112+ messages in thread
From: Sami Imseih @ 2025-05-02 16:58 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Thanks for raising this idea!
I am generally -1 on the idea of autovacuum performing parallel
index vacuum, because I always felt that the parallel option should
be employed in a targeted manner for a specific table. If you have a bunch
of large tables, some more important than others, a/v may end
up using parallel resources on the least important tables and you
will have to adjust a/v settings per table, etc to get the right table
to be parallel index vacuumed by a/v.
Also, with the TIDStore improvements for index cleanup, and the practical
elimination of multi-pass index vacuums, I see this being even less
convincing as something to add to a/v.
Now, if I am going to allocate extra workers to run vacuum in parallel, why
not just provide more autovacuum workers instead so I can get more tables
vacuumed within a span of time?
> Once we have parallel heap vacuum, as discussed in thread[1], it would
> also likely be beneficial to incorporate it into autovacuum during
> aggressive vacuum or failsafe mode.
IIRC, index cleanup is disabled by failsafe.
--
Sami Imseih
Amazon Web Services (AWS)
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-02 18:12 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-05-02 18:12 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <[email protected]> wrote:
>
> As I understand it, we initially disabled parallel vacuum for
> autovacuum because their objectives are somewhat contradictory.
> Parallel vacuum aims to accelerate the process by utilizing additional
> resources, while autovacuum is designed to perform cleaning operations
> with minimal impact on foreground transaction processing (e.g.,
> through vacuum delay).
>
Yep, we also decided that we must not create more a/v workers for
index processing.
In the current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).
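A minimal sketch of the invariant described above, assuming the launcher simply clamps each request to the free slots in the existing worker pool. The function and parameter names are hypothetical; the actual logic lives in autovacuum.c in the patch and may differ.

```c
/*
 * Decide how many supportive workers a leader's request can receive.
 *
 * requested       - number of workers the leader asked for
 * max_workers     - pool size, i.e. the value of autovacuum_max_workers
 * running_workers - a/v workers already running, including the leader
 *
 * The result never pushes the total above max_workers.
 */
int
grant_parallel_autovac_workers(int requested, int max_workers,
							   int running_workers)
{
	int			free_slots = max_workers - running_workers;

	if (requested <= 0 || free_slots <= 0)
		return 0;

	return (requested < free_slots) ? requested : free_slots;
}
```

With a pool of 3 and one worker (the leader) already running, a request for 4 would be granted at most 2 under this assumption.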
> Nevertheless, I see your point about the potential benefits of using
> parallel vacuum within autovacuum in specific scenarios. The crucial
> consideration is determining appropriate criteria for triggering
> parallel vacuum in autovacuum. Given that we currently support only
> parallel index processing, suitable candidates might be autovacuum
> operations on large tables that have a substantial number of
> sufficiently large indexes and a high volume of garbage tuples.
>
> Although the actual number of parallel workers ultimately depends on
> the number of eligible indexes, it might be beneficial to introduce a
> storage parameter, say parallel_vacuum_workers, that allows control
> over the number of parallel vacuum workers on a per-table basis.
>
For now, we have three GUC variables for this purpose:
max_parallel_index_autovac_workers, autovac_idx_parallel_min_rows,
autovac_idx_parallel_min_indexes.
That is, everything is as you said. But we are still conducting
research on this issue. I would like to get rid of some of these
parameters.
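As a rough, standalone illustration of how the three GUCs combine, the check below mirrors the condition added to `vacuum_rel` in the attached patch. The struct and function names are hypothetical, and the sketch assumes the table's statistics entry is available; in the patch a missing entry simply leaves parallel mode off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values of the three GUCs named above, gathered for illustration. */
typedef struct ParallelAutovacSettings
{
	int			max_parallel_index_autovac_workers;
	int			autovac_idx_parallel_min_rows;
	int			autovac_idx_parallel_min_indexes;
} ParallelAutovacSettings;

/*
 * Return the number of parallel workers autovacuum would request for a
 * table, or 0 if the table does not qualify.
 */
int
autovac_parallel_workers_to_request(const ParallelAutovacSettings *s,
									int64_t dead_tuples,
									int num_indexes,
									bool index_cleanup_enabled)
{
	/* Nothing to parallelize when index cleanup is disabled. */
	if (!index_cleanup_enabled)
		return 0;

	/* The table must have accumulated enough dead tuples. */
	if (dead_tuples < s->autovac_idx_parallel_min_rows)
		return 0;

	/* The table must have enough indexes to be worth splitting up. */
	if (num_indexes < s->autovac_idx_parallel_min_indexes)
		return 0;

	/* A non-positive worker limit turns the feature off. */
	if (s->max_parallel_index_autovac_workers <= 0)
		return 0;

	return s->max_parallel_index_autovac_workers;
}
```

The returned value is only a request; the number of workers actually used is still capped later by the number of indexes that can be processed in parallel.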
> Regarding implementation: I notice the WIP patch implements its own
> parallel vacuum mechanism for autovacuum. Have you considered simply
> setting at_params.nworkers to a value greater than zero?
>
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse
code of VACUUM PARALLEL, because it creates its own processes via
dynamic bgworkers machinery.
As I said above - we don't want to consume additional resources. Also
we don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).
As a result, we created our own implementation of parallel index
processing control - see changes in vacuumparallel.c and autovacuum.c.
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-02 18:49 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-05-02 18:49 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Fri, May 2, 2025 at 11:58 PM Sami Imseih <[email protected]> wrote:
>
> I am generally -1 on the idea of autovacuum performing parallel
> index vacuum, because I always felt that the parallel option should
> be employed in a targeted manner for a specific table. If you have a bunch
> of large tables, some more important than others, a/v may end
> up using parallel resources on the least important tables and you
> will have to adjust a/v settings per table, etc to get the right table
> to be parallel index vacuumed by a/v.
Hm, this is a good point. I think I should clarify one thing: in
practice, it is common for users to have one huge table across all
their databases (with 80+ indexes created on it). But, of course,
in general there may be a few such tables.
But we can still adjust the autovac_idx_parallel_min_rows parameter.
If a table has a lot of dead tuples => it is actively used => table is
important (?).
Also, if the user can really determine the "importance" of each of the
tables - we can provide an appropriate table option. Tables with this
option set will be processed in parallel in priority order. What do
you think about such an idea?
>
> Also, with the TIDStore improvements for index cleanup, and the practical
> elimination of multi-pass index vacuums, I see this being even less
> convincing as something to add to a/v.
If I understood correctly, then we are talking about the fact that
TIDStore can store so many tuples that in fact a second pass is never
needed.
But the number of passes does not affect the presented optimization in
any way. We must think about a large number of indexes that must be
processed. Even within a single pass we can have a 40% increase in
speed.
>
> Now, if I am going to allocate extra workers to run vacuum in parallel, why
> not just provide more autovacuum workers instead so I can get more tables
> vacuumed within a span of time?
For now, only one process can clean up indexes, so I don't see how
increasing the number of a/v workers will help in the situation that I
mentioned above.
Also, we don't consume additional resources during autovacuum in this
patch: the total number of a/v workers is always <= autovacuum_max_workers.
BTW, see the v2 patch attached to this message (bug fixes) :-)
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v2-0001-WIP-Allow-autovacuum-to-process-indexes-of-single.patch (61.8K, 2-v2-0001-WIP-Allow-autovacuum-to-process-indexes-of-single.patch)
From 1c93a729b844a1dfe109e8d9e54d5cc0a941d061 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 3 May 2025 00:27:45 +0700
Subject: [PATCH v2] WIP Allow autovacuum to process indexes of single table in
parallel
---
src/backend/commands/vacuum.c | 27 +
src/backend/commands/vacuumparallel.c | 289 +++++-
src/backend/postmaster/autovacuum.c | 906 +++++++++++++++++-
src/backend/utils/misc/guc_tables.c | 30 +
src/backend/utils/misc/postgresql.conf.sample | 6 +
src/include/postmaster/autovacuum.h | 23 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 +
.../autovacuum/t/001_autovac_parallel.pl | 137 +++
9 files changed, 1387 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..a5ef5319ccc 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2234,6 +2234,33 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * Decide whether we need to process table with given oid in parallel mode
+ * during autovacuum.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED)
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= autovac_idx_parallel_min_rows)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (num_indexes >= autovac_idx_parallel_min_indexes &&
+ max_parallel_index_autovac_workers > 0)
+ {
+ params->nworkers = max_parallel_index_autovac_workers;
+ }
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..cb4b7c23010 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,20 +1,23 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
- * multiple passes of index bulk-deletion and index cleanup.
+ * multiple passes of index bulk-deletion and index cleanup. For maintenance
+ * vacuum, we launch workers manually (using dynamic bgworkers machinery), and
+ * for autovacuum we send signals to the autovacuum launcher (all logic for
+ * communication among parallel autovacuum processes is in autovacuum.c).
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -34,9 +37,11 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
+#include "utils/memutils.h"
#include "utils/rel.h"
/*
@@ -157,11 +162,20 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
- /* NULL for worker processes */
+ /* Is this structure used for maintenance vacuum or autovacuum */
+ bool is_autovacuum;
+
+ /*
+ * NULL for worker processes.
+ *
+ * NOTE: Parallel autovacuum only needs a subset of the maintenance vacuum
+ * functionality.
+ */
ParallelContext *pcxt;
/* Parent Heap Relation */
@@ -221,6 +235,10 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static ParallelContext *CreateParallelAutoVacContext(int nworkers);
+static void InitializeParallelAutoVacDSM(ParallelContext *pcxt);
+static void DestroyParallelAutoVacContext(ParallelContext *pcxt);
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -280,15 +298,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
}
pvs = (ParallelVacuumState *) palloc0(sizeof(ParallelVacuumState));
+ pvs->is_autovacuum = AmAutoVacuumWorkerProcess();
pvs->indrels = indrels;
pvs->nindexes = nindexes;
pvs->will_parallel_vacuum = will_parallel_vacuum;
pvs->bstrategy = bstrategy;
pvs->heaprel = rel;
- EnterParallelMode();
- pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
- parallel_workers);
+ if (pvs->is_autovacuum)
+ pcxt = CreateParallelAutoVacContext(parallel_workers);
+ else
+ {
+ EnterParallelMode();
+ pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+ parallel_workers);
+ }
Assert(pcxt->nworkers > 0);
pvs->pcxt = pcxt;
@@ -327,7 +351,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
else
querylen = 0; /* keep compiler quiet */
- InitializeParallelDSM(pcxt);
+ if (pvs->is_autovacuum)
+ InitializeParallelAutoVacDSM(pvs->pcxt);
+ else
+ InitializeParallelDSM(pcxt);
/* Prepare index vacuum stats */
indstats = (PVIndStats *) shm_toc_allocate(pcxt->toc, est_indstats_len);
@@ -371,11 +398,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
- shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
+
+ if (pvs->is_autovacuum)
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+ shared->dead_items_info.max_bytes = vac_work_mem * 1024L;
/* Prepare DSA space for dead items */
dead_items = TidStoreCreateShared(shared->dead_items_info.max_bytes,
@@ -453,8 +487,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
- DestroyParallelContext(pvs->pcxt);
- ExitParallelMode();
+ if (pvs->is_autovacuum)
+ DestroyParallelAutoVacContext(pvs->pcxt);
+ else
+ {
+ DestroyParallelContext((ParallelContext *) pvs->pcxt);
+ ExitParallelMode();
+ }
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
@@ -532,6 +571,144 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
}
+/*
+ * Short version of CreateParallelContext (parallel.c). Here we init only those
+ * fields that are needed for parallel index processing during autovacuum.
+ */
+static ParallelContext *
+CreateParallelAutoVacContext(int nworkers)
+{
+ ParallelContext *pcxt;
+ MemoryContext oldcontext;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Number of workers should be non-negative. */
+ Assert(nworkers >= 0);
+
+ /* We might be running in a short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Initialize a new ParallelContext. */
+ pcxt = palloc0(sizeof(ParallelContext));
+ pcxt->nworkers = nworkers;
+ pcxt->nworkers_to_launch = nworkers;
+ shm_toc_initialize_estimator(&pcxt->estimator);
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+
+ return pcxt;
+}
+
+/*
+ * Short version of InitializeParallelDSM (parallel.c). Here we put into dsm
+ * only those data that are needed for parallel index processing during
+ * autovacuum.
+ */
+static void
+InitializeParallelAutoVacDSM(ParallelContext *pcxt)
+{
+ MemoryContext oldcontext;
+ Size tsnaplen = 0;
+ Size asnaplen = 0;
+ Size segsize = 0;
+ char *tsnapspace;
+ char *asnapspace;
+ Snapshot transaction_snapshot = GetTransactionSnapshot();
+ Snapshot active_snapshot = GetActiveSnapshot();
+
+ Assert(pcxt->nworkers >= 1);
+
+ /* We might be running in a very short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnaplen = EstimateSnapshotSpace(transaction_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, tsnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
+ asnaplen = EstimateSnapshotSpace(active_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, asnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+
+ /* Create DSM and initialize with new table of contents. */
+ segsize = shm_toc_estimate(&pcxt->estimator);
+ pcxt->seg = dsm_create(segsize, DSM_CREATE_NULL_IF_MAXSEGMENTS);
+
+ if (pcxt->seg == NULL)
+ {
+ pcxt->nworkers = 0;
+ pcxt->private_memory = MemoryContextAlloc(TopMemoryContext, segsize);
+ }
+
+ pcxt->toc = shm_toc_create(AV_PARALLEL_MAGIC,
+ pcxt->seg == NULL ? pcxt->private_memory :
+ dsm_segment_address(pcxt->seg),
+ segsize);
+
+ /* We can skip the rest of this if we're not budgeting for any workers. */
+ if (pcxt->nworkers > 0)
+ {
+ /*
+ * Serialize the transaction snapshot if the transaction isolation
+ * level uses a transaction snapshot.
+ */
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnapspace = shm_toc_allocate(pcxt->toc, tsnaplen);
+ SerializeSnapshot(transaction_snapshot, tsnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT,
+ tsnapspace);
+ }
+
+ /* Serialize the active snapshot. */
+ asnapspace = shm_toc_allocate(pcxt->toc, asnaplen);
+ SerializeSnapshot(active_snapshot, asnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, asnapspace);
+ }
+
+ /* Update nworkers_to_launch, in case we changed nworkers above. */
+ pcxt->nworkers_to_launch = pcxt->nworkers;
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Short version of DestroyParallelContext (parallel.c). Here we clean up only
+ * those data that were used during parallel index processing during autovacuum.
+ */
+static void
+DestroyParallelAutoVacContext(ParallelContext *pcxt)
+{
+ /*
+ * If we have allocated a shared memory segment, detach it. This will
+ * implicitly detach the error queues, and any other shared memory queues,
+ * stored there.
+ */
+ if (pcxt->seg != NULL)
+ {
+ dsm_detach(pcxt->seg);
+ pcxt->seg = NULL;
+ }
+
+ /*
+ * If this parallel context is actually in backend-private memory rather
+ * than shared memory, free that memory instead.
+ */
+ if (pcxt->private_memory != NULL)
+ {
+ pfree(pcxt->private_memory);
+ pcxt->private_memory = NULL;
+ }
+
+ AutoVacuumReleaseParallelWork(false);
+ pfree(pcxt);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -558,7 +735,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_index_autovac_workers == 0 && AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +776,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, max_parallel_index_autovac_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -670,7 +851,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
- if (num_index_scans > 0)
+ if (num_index_scans > 0 && !pvs->is_autovacuum)
ReinitializeParallelDSM(pvs->pcxt);
/*
@@ -686,9 +867,22 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* The number of workers can vary between bulkdelete and cleanup
* phase.
*/
- ReinitializeParallelWorkers(pvs->pcxt, nworkers);
-
- LaunchParallelWorkers(pvs->pcxt);
+ if (pvs->is_autovacuum)
+ {
+ pvs->pcxt->nworkers_to_launch = Min(pvs->pcxt->nworkers, nworkers);
+ if (pvs->pcxt->nworkers > 0 && pvs->pcxt->nworkers_to_launch > 0)
+ {
+ pvs->pcxt->nworkers_launched =
+ LaunchParallelAutovacuumWorkers(pvs->heaprel->rd_id,
+ pvs->pcxt->nworkers_to_launch,
+ dsm_segment_handle(pvs->pcxt->seg));
+ }
+ }
+ else
+ {
+ ReinitializeParallelWorkers(pvs->pcxt, nworkers);
+ LaunchParallelWorkers(pvs->pcxt);
+ }
if (pvs->pcxt->nworkers_launched > 0)
{
@@ -733,8 +927,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
if (nworkers > 0)
{
- /* Wait for all vacuum workers to finish */
- WaitForParallelWorkersToFinish(pvs->pcxt);
+ /*
+ * Wait for all [auto]vacuum workers involved in parallel index
+ * processing (if any) to finish and advance state machine.
+ */
+ if (pvs->is_autovacuum && pvs->pcxt->nworkers_launched >= 0)
+ ParallelAutovacuumEndSyncPoint(false);
+ else if (!pvs->is_autovacuum)
+ WaitForParallelWorkersToFinish(pvs->pcxt);
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
@@ -982,8 +1182,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
@@ -997,23 +1197,22 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
BufferUsage *buffer_usage;
WalUsage *wal_usage;
int nindexes;
+ int worker_number;
char *sharedquery;
ErrorContextCallback errcallback;
- /*
- * A parallel vacuum worker must have only PROC_IN_VACUUM flag since we
- * don't support parallel vacuum for autovacuum as of now.
- */
- Assert(MyProc->statusFlags == PROC_IN_VACUUM);
-
- elog(DEBUG1, "starting parallel vacuum worker");
+ Assert(MyProc->statusFlags == PROC_IN_VACUUM || AmAutoVacuumWorkerProcess());
+ elog(DEBUG1, "starting parallel [auto]vacuum worker");
shared = (PVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
/* Set debug_query_string for individual workers */
- sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
- debug_query_string = sharedquery;
- pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+ debug_query_string = sharedquery;
+ pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ }
/* Track query ID */
pgstat_report_query_id(shared->queryid, false);
@@ -1091,8 +1290,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
- InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
- &wal_usage[ParallelWorkerNumber]);
+
+ worker_number = AmAutoVacuumWorkerProcess() ?
+ GetAutoVacuumParallelWorkerNumber() : ParallelWorkerNumber;
+
+ InstrEndParallelQuery(&buffer_usage[worker_number],
+ &wal_usage[worker_number]);
/* Report any remaining cost-based vacuum delay time */
if (track_cost_delay_timing)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..cb9c9f374bb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -90,6 +90,7 @@
#include "postmaster/postmaster.h"
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/lmgr.h"
@@ -102,6 +103,7 @@
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
+#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -129,6 +131,9 @@ int autovacuum_anl_thresh;
double autovacuum_anl_scale;
int autovacuum_freeze_max_age;
int autovacuum_multixact_freeze_max_age;
+int max_parallel_index_autovac_workers;
+int autovac_idx_parallel_min_rows;
+int autovac_idx_parallel_min_indexes;
double autovacuum_vac_cost_delay;
int autovacuum_vac_cost_limit;
@@ -164,6 +169,14 @@ static int default_freeze_table_age;
static int default_multixact_freeze_min_age;
static int default_multixact_freeze_table_age;
+/*
+ * Number of additional workers that were requested for parallel index processing
+ * during autovacuum.
+ */
+static int nworkers_for_idx_autovac = 0;
+
+static int nworkers_launched = 0;
+
/* Memory context for long-lived data */
static MemoryContext AutovacMemCxt;
@@ -222,6 +235,8 @@ typedef struct autovac_table
* wi_proc pointer to PGPROC of the running worker, NULL if not started
* wi_launchtime Time at which this worker was launched
* wi_dobalance Whether this worker should be included in balance calculations
+ * wi_pcleanup if (> 0) => this worker must participate in parallel index
+ * vacuuming as a supportive worker. Must be (== 0) for the leader worker.
*
* All fields are protected by AutovacuumLock, except for wi_tableoid and
* wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -237,10 +252,17 @@ typedef struct WorkerInfoData
TimestampTz wi_launchtime;
pg_atomic_flag wi_dobalance;
bool wi_sharedrel;
+ int wi_pcleanup;
} WorkerInfoData;
typedef struct WorkerInfoData *WorkerInfo;
+#define AmParallelIdxAutoVacSupportive() \
+ (MyWorkerInfo != NULL && MyWorkerInfo->wi_pcleanup > 0)
+
+#define AmParallelIdxAutoVacLeader() \
+ (MyWorkerInfo != NULL && MyWorkerInfo->wi_pcleanup == 0)
+
/*
* Possible signals received by the launcher from remote processes. These are
* stored atomically in shared memory so that other processes can set them
@@ -250,9 +272,11 @@ typedef enum
{
AutoVacForkFailed, /* failed trying to start a worker */
AutoVacRebalance, /* rebalance the cost limits */
+ AutoVacParallelReq, /* request for parallel index vacuum */
+ AutoVacNumSignals, /* must be last */
} AutoVacuumSignal;
-#define AutoVacNumSignals (AutoVacRebalance + 1)
+#define AutoVacNumSignals (AutoVacParallelReq + 1)
/*
* Autovacuum workitem array, stored in AutoVacuumShmem->av_workItems. This
@@ -272,6 +296,50 @@ typedef struct AutoVacuumWorkItem
#define NUM_WORKITEMS 256
+typedef enum
+{
+ LAUNCHER = 0, /* autovacuum launcher must wake everyone up */
+ LEADER, /* leader must wake everyone up */
+ LAST_WORKER, /* the last inited supportive worker must wake everyone
+ up */
+} SyncType;
+
+typedef enum
+{
+ STARTUP = 0, /* initial value - no sync points were passed */
+ START_SYNC_POINT_PASSED, /* start_sync_point was passed */
+ END_SYNC_POINT_PASSED, /* end_sync_point was passed */
+ SHUTDOWN, /* leader wants to shut down parallel index
+ vacuum because an error occurred */
+} Status;
+
+/*
+ * Structure, stored in AutoVacuumShmem->pav_workItem. This is used for managing
+ * parallel index processing (within a single table).
+ */
+typedef struct ParallelAutoVacuumWorkItem
+{
+ Oid avw_database;
+ Oid avw_relation;
+ int nworkers_participating;
+ int nworkers_to_launch;
+ int nworkers_sleeping; /* leader doesn't count */
+ int nfinished; /* # of workers that have already finished parallel
+ index processing (and have probably already exited) */
+
+ dsm_handle handl;
+ int leader_proc_pid;
+
+ PGPROC *leader_proc;
+ ConditionVariable cv;
+
+ bool active; /* being processed */
+ bool leader_sleeping_on_ssp; /* sleeping on start sync point */
+ bool leader_sleeping_on_esp; /* sleeping on end sync point */
+ SyncType sync_type;
+ Status status;
+} ParallelAutoVacuumWorkItem;
+
/*-------------
* The main autovacuum shmem struct. On shared memory we store this main
* struct and the array of WorkerInfo structs. This struct keeps:
@@ -283,6 +351,8 @@ typedef struct AutoVacuumWorkItem
* av_startingWorker pointer to WorkerInfo currently being started (cleared by
* the worker itself as soon as it's up and running)
* av_workItems work item array
+ * pav_workItem information needed for parallel index processing within a
+ * single table
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
*
@@ -298,6 +368,7 @@ typedef struct
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+ ParallelAutoVacuumWorkItem pav_workItem;
pg_atomic_uint32 av_nworkersForBalance;
} AutoVacuumShmemStruct;
@@ -322,11 +393,17 @@ pg_noreturn static void AutoVacLauncherShutdown(void);
static void launcher_determine_sleep(bool canlaunch, bool recursing,
struct timeval *nap);
static void launch_worker(TimestampTz now);
+static void launch_worker_for_pcleanup(TimestampTz now);
+static void eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item,
+ bool all_launched);
static List *get_database_list(void);
static void rebuild_database_list(Oid newdb);
static int db_comparator(const void *a, const void *b);
static void autovac_recalculate_workers_for_balance(void);
+static int parallel_autovacuum_start_sync_point(bool keep_lock);
+static void handle_parallel_idx_autovac_errors(void);
+
static void do_autovacuum(void);
static void FreeWorkerInfo(int code, Datum arg);
@@ -355,6 +432,10 @@ static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+typedef bool (*wakeup_condition) (ParallelAutoVacuumWorkItem *item);
+static bool start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static bool end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static void CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond);
/********************************************************************
@@ -583,7 +664,14 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(av_worker_available(), false, &nap);
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /* Take the smallest possible sleep interval. */
+ nap.tv_sec = 0;
+ nap.tv_usec = MIN_AUTOVAC_SLEEPTIME * 1000;
+ }
+ else
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -614,6 +702,19 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
LWLockRelease(AutovacuumLock);
}
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_signal[AutoVacParallelReq])
+ {
+ ParallelAutoVacuumWorkItem *item;
+
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = false;
+
+ item = &AutoVacuumShmem->pav_workItem;
+ nworkers_for_idx_autovac = item->nworkers_to_launch;
+ nworkers_launched = 0;
+ }
+ LWLockRelease(AutovacuumLock);
+
if (AutoVacuumShmem->av_signal[AutoVacForkFailed])
{
/*
@@ -686,6 +787,7 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
+ worker->wi_pcleanup = -1;
dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -698,9 +800,29 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
}
LWLockRelease(AutovacuumLock); /* either shared or exclusive */
- /* if we can't do anything, just go back to sleep */
if (!can_launch)
+ {
+ /*
+ * If the launcher cannot launch all the workers requested for
+ * parallel index vacuum, it must handle all possible lock conflicts
+ * and tell everyone that there will be no new supportive workers.
+ */
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ ParallelAutoVacuumWorkItem *item;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+ Assert(item->active);
+
+ eliminate_lock_conflicts(item, false);
+ nworkers_launched = nworkers_for_idx_autovac = 0;
+ LWLockRelease(AutovacuumLock);
+ }
+
+ /* if we can't do anything else, just go back to sleep */
continue;
+ }
/* We're OK to start a new worker */
@@ -716,6 +838,15 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
*/
launch_worker(current_time);
}
+ else if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /*
+ * One of the active autovacuum workers sent us a request to launch
+ * participants for parallel index vacuum. We check this case first
+ * because we need to start participants as soon as possible.
+ */
+ launch_worker_for_pcleanup(current_time);
+ }
else
{
/*
@@ -1267,6 +1398,7 @@ do_start_worker(void)
worker->wi_dboid = avdb->adw_datid;
worker->wi_proc = NULL;
worker->wi_launchtime = GetCurrentTimestamp();
+ worker->wi_pcleanup = -1;
AutoVacuumShmem->av_startingWorker = worker;
@@ -1349,6 +1481,136 @@ launch_worker(TimestampTz now)
}
}
+/*
+ * launch_worker_for_pcleanup
+ *
+ * Wrapper for starting a worker (requested by leader of parallel index
+ * vacuuming) from the launcher.
+ */
+static void
+launch_worker_for_pcleanup(TimestampTz now)
+{
+ ParallelAutoVacuumWorkItem *item;
+ WorkerInfo worker;
+ dlist_node *wptr;
+
+ Assert(nworkers_launched < nworkers_for_idx_autovac);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Get a worker entry from the freelist. We checked above, so there
+ * really should be a free slot.
+ */
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+
+ worker = dlist_container(WorkerInfoData, wi_links, wptr);
+ worker->wi_dboid = InvalidOid;
+ worker->wi_proc = NULL;
+ worker->wi_launchtime = GetCurrentTimestamp();
+
+ /*
+ * Set an indicator that this worker must join parallel index vacuum.
+ * This variable also serves as a unique id among parallel index
+ * vacuum workers. The first id is '1', because '0' is reserved for the leader.
+ */
+ worker->wi_pcleanup = (nworkers_launched + 1);
+
+ AutoVacuumShmem->av_startingWorker = worker;
+
+ SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
+
+ item = &AutoVacuumShmem->pav_workItem;
+ Assert(item->active);
+
+ nworkers_launched += 1;
+
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ LWLockRelease(AutovacuumLock);
+ return;
+ }
+
+ Assert(item->sync_type == LAUNCHER &&
+ nworkers_launched == nworkers_for_idx_autovac);
+
+ /*
+ * If the launcher managed to launch all the workers requested for
+ * parallel index vacuum, it must handle all possible lock conflicts.
+ */
+ eliminate_lock_conflicts(item, true);
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Must be called from the autovacuum launcher when it has launched all
+ * requested workers for parallel index vacuum, or when it has realized that
+ * no more processes can be launched.
+ *
+ * In this function the launcher assigns roles in such a way as to avoid lock
+ * conflicts between the leader and supportive workers.
+ *
+ * AutovacuumLock must be held in exclusive mode before calling this function!
+ */
+static void
+eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item, bool all_launched)
+{
+ Assert(AmAutoVacuumLauncherProcess());
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ /* So, let's start... */
+
+ if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If both leader and all launched supportive workers are sleeping, then
+ * only we can wake everyone up.
+ */
+ ConditionVariableBroadcast(&item->cv);
+
+ /* Advance status. */
+ item->status = START_SYNC_POINT_PASSED;
+ }
+ else if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping < nworkers_launched)
+ {
+ /*
+ * If the leader is already sleeping, but some supportive workers are still
+ * initializing, we shift the responsibility for waking everyone up to the
+ * worker that completes initialization last.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+ else if (!item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If only the leader is not sleeping, it must wake up all workers when it
+ * finishes all preparations.
+ */
+ item->sync_type = LEADER;
+ }
+ else
+ {
+ /*
+ * If nobody is sleeping, we assume that the leader is more likely to
+ * fall asleep first, so set sync type to LAST_WORKER. If the last worker
+ * sees that the leader is still not sleeping, it will change sync type to
+ * LEADER and go to sleep.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+
+ /*
+ * If we cannot launch all requested workers, refresh the
+ * nworkers_to_launch value, so that the last worker can find out
+ * that it really is the last.
+ */
+ if (!all_launched && item->sync_type == LAST_WORKER)
+ item->nworkers_to_launch = nworkers_launched;
+}
+
/*
* Called from postmaster to signal a failure to fork a process to become
* worker. The postmaster should kill(SIGUSR2) the launcher shortly
@@ -1360,6 +1622,37 @@ AutoVacWorkerFailed(void)
AutoVacuumShmem->av_signal[AutoVacForkFailed] = true;
}
+/*
+ * Called from an autovacuum worker to signal that it needs participants in
+ * parallel index vacuum. The function sends SIGUSR2 to the launcher and returns
+ * 'true' iff the signal was sent successfully.
+ */
+bool
+AutoVacParallelWorkRequest(void)
+{
+ if (AutoVacuumShmem->av_launcherpid == 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("autovacuum launcher is dead")));
+
+ return false;
+ }
+
+ if (kill(AutoVacuumShmem->av_launcherpid, SIGUSR2) < 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_SYSTEM_ERROR),
+ errmsg("failed to send signal to autovac launcher (pid %d): %m",
+ AutoVacuumShmem->av_launcherpid)));
+
+ return false;
+ }
+
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = true;
+ return true;
+}
+
/* SIGUSR2: a worker is up and running, or just finished, or failed to fork */
static void
avl_sigusr2_handler(SIGNAL_ARGS)
@@ -1559,6 +1852,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
{
char dbname[NAMEDATALEN];
+ Assert(MyWorkerInfo->wi_pcleanup < 0);
+
/*
* Report autovac startup to the cumulative stats system. We
* deliberately do this before InitPostgres, so that the
@@ -1593,12 +1888,113 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
+ else if (AmParallelIdxAutoVacSupportive())
+ {
+ ParallelAutoVacuumWorkItem *item;
+ dsm_handle handle;
+ PGPROC *leader_proc;
+ int leader_proc_pid;
+ dsm_segment *seg;
+ shm_toc *toc;
+ char *asnapspace;
+ char *tsnapspace;
+ char dbname[NAMEDATALEN];
+ Snapshot tsnapshot;
+ Snapshot asnapshot;
+
+ /*
+ * We will abort parallel index vacuuming within the current process if
+ * something errors out
+ */
+ PG_TRY();
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+ dbid = item->avw_database;
+ handle = item->handl;
+ leader_proc = item->leader_proc;
+ leader_proc_pid = item->leader_proc_pid;
+ LWLockRelease(AutovacuumLock);
+
+ InitPostgres(NULL, dbid, NULL, InvalidOid,
+ INIT_PG_OVERRIDE_ALLOW_CONNS,
+ dbname);
+
+ set_ps_display(dbname);
+ if (PostAuthDelay)
+ pg_usleep(PostAuthDelay * 1000000L);
+
+ /* And do an appropriate amount of work */
+ recentXid = ReadNextTransactionId();
+ recentMulti = ReadNextMultiXactId();
+
+ if (parallel_autovacuum_start_sync_point(false) == -1)
+ {
+ /* We are not participating anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ goto exit;
+ }
+
+ seg = dsm_attach(handle);
+ if (seg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not map dynamic shared memory segment")));
+
+ toc = shm_toc_attach(AV_PARALLEL_MAGIC, dsm_segment_address(seg));
+ if (toc == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid magic number in dynamic shared memory segment")));
+
+ if (!BecomeLockGroupMember(leader_proc, leader_proc_pid))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not become lock group member")));
+ }
+
+ StartTransactionCommand();
+
+ asnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, false);
+ tsnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT, true);
+ asnapshot = RestoreSnapshot(asnapspace);
+ tsnapshot = tsnapspace ? RestoreSnapshot(tsnapspace) : asnapshot;
+ RestoreTransactionSnapshot(tsnapshot, leader_proc);
+ PushActiveSnapshot(asnapshot);
+
+ /*
+ * We've changed which tuples we can see, and must therefore
+ * invalidate system caches.
+ */
+ InvalidateSystemCaches();
+
+ parallel_vacuum_main(seg, toc);
+
+ /* Must pop active snapshot so snapmgr.c doesn't complain. */
+ PopActiveSnapshot();
+
+ dsm_detach(seg);
+ CommitTransactionCommand();
+ ParallelAutovacuumEndSyncPoint(false);
+ }
+ PG_CATCH();
+ {
+ EmitErrorReport();
+ if (AmParallelIdxAutoVacSupportive())
+ handle_parallel_idx_autovac_errors();
+ }
+ PG_END_TRY();
+ }
/*
* The launcher will be notified of my death in ProcKill, *if* we managed
* to get a worker slot at all
*/
+exit:
/* All done, go away */
proc_exit(0);
}
@@ -2461,6 +2857,10 @@ do_autovacuum(void)
tab->at_datname, tab->at_nspname, tab->at_relname);
EmitErrorReport();
+ /* if we are parallel index vacuuming leader, we must shut it down */
+ if (AmParallelIdxAutoVacLeader())
+ handle_parallel_idx_autovac_errors();
+
/* this resets ProcGlobal->statusFlags[i] too */
AbortOutOfAnyTransaction();
FlushErrorState();
@@ -3296,6 +3696,503 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Release the work item used for managing parallel index vacuum. Must be called
+ * once and only from the leader worker.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released at the end
+ * of function execution.
+ */
+void
+AutoVacuumReleaseParallelWork(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+
+ /*
+ * We might not get the workitem from launcher (we must not be considered
+ * as leader in this case), so just leave.
+ */
+ if (!AmParallelIdxAutoVacLeader())
+ return;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+
+ Assert(AmParallelIdxAutoVacLeader() &&
+ workitem->leader_proc_pid == MyProcPid);
+
+ workitem->leader_proc = NULL;
+ workitem->leader_proc_pid = 0;
+ workitem->active = false;
+
+ /* We are not leader anymore. */
+ MyWorkerInfo->wi_pcleanup = -1;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+}
+
+static bool
+start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ /*
+ * In the normal case we should exit the sleep loop after the last launched
+ * supportive worker has passed the sync point (status == START_SYNC_POINT_PASSED).
+ * But if we are in SHUTDOWN mode, all launched workers will just exit the
+ * sync point without advancing the status. We can handle such a case if we
+ * check that nworkers_participating == nworkers_to_launch.
+ */
+ if (item->status == SHUTDOWN)
+ need_wakeup = (item->nworkers_participating == item->nworkers_to_launch);
+ else
+ need_wakeup = item->status == START_SYNC_POINT_PASSED;
+ }
+ else
+ need_wakeup = (item->status == START_SYNC_POINT_PASSED ||
+ item->status == SHUTDOWN);
+
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+static bool
+end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ Assert(AmParallelIdxAutoVacLeader());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ need_wakeup = item->status == END_SYNC_POINT_PASSED;
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+/*
+ * Waiting on a condition variable is a frequent operation, so it has been
+ * factored out into a separate function. Caller must hold AutovacuumLock before
+ * calling it.
+ */
+static void
+CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond)
+{
+ ConditionVariablePrepareToSleep(&item->cv);
+
+ LWLockRelease(AutovacuumLock);
+ PG_TRY();
+ {
+ do
+ {
+ ConditionVariableSleep(&item->cv, PG_WAIT_IPC);
+ } while (!wakeup_cond(item));
+ }
+ PG_CATCH();
+ {
+ ConditionVariableCancelSleep();
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+
+ ConditionVariableCancelSleep();
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+}
+
+/*
+ * This function is used to synchronize the leader with supportive workers during
+ * parallel index vacuuming. Each process leaves the sync point iff:
+ * Leader worker is ready to perform parallel vacuum &&
+ * All launched supportive workers are ready to perform parallel vacuum &&
+ * (Autovacuum launcher already launched all requested workers ||
+ * Autovacuum launcher cannot launch more workers)
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released at the end
+ * of function execution.
+ *
+ * NOTE: Some workers may call this function when the leader worker has decided
+ * to shut down parallel vacuuming. In this case -1 is returned.
+ */
+static int
+parallel_autovacuum_start_sync_point(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+ SyncType sync_type;
+ int num_participants;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+ Assert(workitem->active);
+ sync_type = workitem->sync_type;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->leader_proc_pid == MyProcPid);
+
+ /* Wake up all sleeping supportive workers, if required ... */
+ if (sync_type == LEADER)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ * Don't advance if we call this function from the error-handling
+ * function (status == SHUTDOWN).
+ */
+ if (workitem->status != SHUTDOWN)
+ workitem->status = START_SYNC_POINT_PASSED;
+ }
+ /* ... otherwise, wait for somebody to wake us up */
+ else
+ {
+ workitem->leader_sleeping_on_ssp = true;
+ CVSleep(workitem, start_sync_point_wakeup_cond);
+ workitem->leader_sleeping_on_ssp = false;
+
+ /*
+ * A priori, we believe that in the end everyone should be awakened
+ * by the leader.
+ */
+ workitem->sync_type = LEADER;
+ }
+ }
+ else
+ {
+ workitem->nworkers_participating += 1;
+
+ /*
+ * If we know that the launcher will no longer attempt to launch more
+ * supportive workers for this item => we are LAST_WORKER for sure.
+ *
+ * Note that the launcher sets the LAST_WORKER sync type without knowing
+ * the current status of the leader. So we also check that the leader is
+ * sleeping before waking everyone up. Otherwise, we must wait for the
+ * leader (and ask it to wake everyone up).
+ */
+ if (workitem->nworkers_participating == workitem->nworkers_to_launch &&
+ sync_type == LAST_WORKER && workitem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * We must not advance status if leader wants to shut down parallel
+ * execution (see checks below).
+ */
+ if (workitem->status != SHUTDOWN)
+ workitem->status = START_SYNC_POINT_PASSED;
+ }
+ else
+ {
+ if (workitem->nworkers_participating == workitem->nworkers_to_launch &&
+ sync_type == LAST_WORKER)
+ {
+ workitem->sync_type = LEADER;
+ }
+
+ workitem->nworkers_sleeping += 1;
+ CVSleep(workitem, start_sync_point_wakeup_cond);
+ workitem->nworkers_sleeping -= 1;
+ }
+ }
+
+ /* Tell caller that it must not participate in parallel index cleanup. */
+ if (workitem->status == SHUTDOWN)
+ num_participants = -1;
+ else
+ num_participants = workitem->nworkers_participating;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return num_participants;
+}
+
+/*
+ * Like the function above, but must be called by the leader and supportive
+ * workers when they have finished parallel index vacuum.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released at the end
+ * of function execution.
+ */
+void
+ParallelAutovacuumEndSyncPoint(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+ Assert(workitem->active);
+
+ if (workitem->nworkers_participating == 0)
+ {
+ Assert(!AmParallelIdxAutoVacSupportive());
+
+ /*
+ * We have two cases when no supportive workers were launched:
+ * 1) Leader got workitem, but launcher didn't launch any
+ * workers => just advance status, because we don't need to wait
+ * for anybody.
+ * 2) Leader didn't get workitem, because it was already in use =>
+ * we must not touch it. Just leave.
+ */
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->leader_proc_pid == MyProcPid);
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+ else
+ Assert(workitem->leader_proc_pid != MyProcPid);
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+ }
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->leader_proc_pid == MyProcPid);
+ Assert(workitem->sync_type == LEADER);
+
+ /* Wait for all workers to finish (only last worker will wake us up) */
+ if (workitem->nfinished != workitem->nworkers_participating)
+ {
+ workitem->sync_type = LAST_WORKER;
+ workitem->leader_sleeping_on_esp = true;
+ CVSleep(workitem, end_sync_point_wakeup_cond);
+ workitem->leader_sleeping_on_esp = false;
+
+ Assert(workitem->nfinished == workitem->nworkers_participating);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ */
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ else
+ {
+ workitem->nfinished += 1;
+
+ /* If we are the last worker to finish, wake up the leader.
+ *
+ * If not - just leave, because supportive worker already finished all
+ * work and must die.
+ */
+ if (workitem->sync_type == LAST_WORKER &&
+ workitem->nfinished == workitem->nworkers_participating &&
+ workitem->leader_sleeping_on_esp)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * Don't need to check SHUTDOWN status here - all supportive workers
+ * are about to finish anyway.
+ */
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+
+ /* We are not participating anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ }
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+}
+
+/*
+ * Get id of parallel index vacuum worker (counting from 0).
+ */
+int
+GetAutoVacuumParallelWorkerNumber(void)
+{
+ Assert(AmAutoVacuumWorkerProcess() && MyWorkerInfo->wi_pcleanup > 0);
+ return (MyWorkerInfo->wi_pcleanup - 1);
+}
+
+/*
+ * The leader autovacuum process can decide that it needs several helper workers
+ * to process a table in parallel mode. It must set up a parallel context and call
+ * LaunchParallelAutovacuumWorkers.
+ *
+ * In this function we do the following:
+ * 1) Send a signal to the autovacuum launcher, which creates 'supportive workers'
+ * during the launcher's standard work loop.
+ * 2) Wait for supportive workers to start.
+ *
+ * The function returns the number of workers that the launcher was able to launch
+ * (may be less than 'nworkers_to_launch'), or -1 if the work item is already in use.
+ */
+int
+LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle)
+{
+ int nworkers_launched = 0;
+ ParallelAutoVacuumWorkItem *workitem;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+
+ /*
+ * For now, there can be only one leader across the whole cluster.
+ * TODO: fix it in next versions
+ */
+ if (workitem->active && workitem->leader_proc_pid != MyProcPid)
+ {
+ LWLockRelease(AutovacuumLock);
+ return -1;
+ }
+
+ /* Notify autovacuum launcher that we need supportive workers */
+ if (AutoVacParallelWorkRequest())
+ {
+ /* OK, we can use this workitem entry. Init it. */
+ workitem->avw_database = MyDatabaseId;
+ workitem->avw_relation = rel_id;
+ workitem->handl = handle;
+ workitem->leader_proc = MyProc;
+ workitem->leader_proc_pid = MyProcPid;
+ workitem->nworkers_participating = 0;
+ workitem->nworkers_to_launch = nworkers_to_launch;
+ workitem->leader_sleeping_on_ssp = false;
+ workitem->leader_sleeping_on_esp = false;
+ workitem->nworkers_sleeping = 0;
+ workitem->nfinished = 0;
+ workitem->sync_type = LAUNCHER;
+ workitem->status = STARTUP;
+
+ workitem->active = true;
+ LWLockRelease(AutovacuumLock);
+
+ /* Become the leader */
+ MyWorkerInfo->wi_pcleanup = 0;
+
+ /* All created workers must get same locks as leader process */
+ BecomeLockGroupLeader();
+
+ /*
+ * Wait until all supportive workers are launched. Also retrieve the
+ * actual number of participants.
+ */
+
+ nworkers_launched = parallel_autovacuum_start_sync_point(false);
+ Assert(nworkers_launched >= 0);
+ }
+ else
+ {
+ /*
+ * If we (for any reason) cannot send signal to the launcher, don't try
+ * to do index vacuuming in parallel
+ */
+ LWLockRelease(AutovacuumLock);
+ return 0;
+ }
+
+ return nworkers_launched;
+}
+
+/*
+ * During parallel index vacuuming any worker (both supportive workers and the
+ * leader) can hit an error.
+ * In order to handle it in the right way we must call this function.
+ */
+static void
+handle_parallel_idx_autovac_errors(void)
+{
+ ParallelAutoVacuumWorkItem *item;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ if (item->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If start sync point already passed - just wait for all supportive
+ * workers to finish and exit.
+ */
+ ParallelAutovacuumEndSyncPoint(true);
+ }
+ else if (item->status == STARTUP)
+ {
+ /*
+ * If no sync point has been passed, we can prevent supportive workers
+ * from performing their work - set SHUTDOWN status and wait until
+ * all workers have seen it.
+ */
+ item->status = SHUTDOWN;
+ parallel_autovacuum_start_sync_point(true);
+ }
+
+ AutoVacuumReleaseParallelWork(true);
+ }
+ else
+ {
+ Assert(AmParallelIdxAutoVacSupportive());
+
+ if (item->status == STARTUP || item->status == SHUTDOWN)
+ {
+ /*
+ * If no sync point has been passed - just exclude ourselves from the
+ * participants. Further parallel index vacuuming will take place
+ * as usual.
+ */
+ item->nworkers_to_launch -= 1;
+
+ if (item->nworkers_participating == item->nworkers_to_launch &&
+ item->sync_type == LAST_WORKER && item->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&item->cv);
+
+ if (item->status != SHUTDOWN)
+ item->status = START_SYNC_POINT_PASSED;
+ }
+ }
+ else if (item->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If start sync point already passed we will simulate the usual
+ * end of work (see ParallelAutovacuumEndSyncPoint).
+ */
+ item->nfinished += 1;
+
+ /*
+ * We check "!item->leader_sleeping_on_ssp" in order to handle an
+ * almost impossible situation, when the leader didn't have time to wake
+ * up after the start sync point (but the last worker already advanced
+ * status to START_SYNC_POINT_PASSED). In this case we should not
+ * advance status to END_SYNC_POINT_PASSED, so the leader can continue
+ * processing.
+ */
+ if (item->sync_type == LAST_WORKER &&
+ item->nfinished == item->nworkers_participating &&
+ !item->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&item->cv);
+ item->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3361,6 +4258,9 @@ AutoVacuumShmemInit(void)
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
+ memset(&AutoVacuumShmem->pav_workItem, 0,
+ sizeof(ParallelAutoVacuumWorkItem));
+ ConditionVariableInit(&AutoVacuumShmem->pav_workItem.cv);
worker = (WorkerInfo) ((char *) AutoVacuumShmem +
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
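
The four-way decision made in eliminate_lock_conflicts() above can be read as a small truth table over "is the leader sleeping?" and "are all launched workers sleeping?". The following is only a standalone sketch of that table, not code from the patch: SyncDecision and decide_sync() are hypothetical names introduced for illustration, and the real function also updates shared state under AutovacuumLock.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the SyncType values introduced by the patch. */
typedef enum { LAUNCHER = 0, LEADER, LAST_WORKER } SyncType;

/* Hypothetical result type, used only in this sketch. */
typedef struct
{
    SyncType sync_type;     /* who is responsible for the broadcast */
    bool     broadcast_now; /* true if the launcher wakes everyone itself */
} SyncDecision;

/*
 * Hypothetical pure-function model of the branches in
 * eliminate_lock_conflicts(): given the sleeping state of the leader and
 * of the launched supportive workers, decide who wakes everyone up.
 */
static inline SyncDecision
decide_sync(bool leader_sleeping, int nsleeping, int nlaunched)
{
    SyncDecision d = {LAUNCHER, false};

    assert(nsleeping >= 0 && nsleeping <= nlaunched);

    if (leader_sleeping && nsleeping == nlaunched)
        d.broadcast_now = true;     /* everyone waits: launcher broadcasts */
    else if (!leader_sleeping && nsleeping == nlaunched)
        d.sync_type = LEADER;       /* only the leader is still awake */
    else
        d.sync_type = LAST_WORKER;  /* some workers are still initializing */

    return d;
}
```

For example, decide_sync(true, 1, 2) yields LAST_WORKER, matching the branch where the leader is already asleep but one supportive worker has not yet reached the sync point.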
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..2e36921097a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3647,6 +3647,36 @@ struct config_int ConfigureNamesInt[] =
check_autovacuum_work_mem, NULL, NULL
},
+ {
+ {"max_parallel_index_autovac_workers", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the maximum number of parallel autovacuum worker processes during parallel index vacuuming of a single table."),
+ NULL
+ },
+ &max_parallel_index_autovac_workers,
+ 0, 0, MAX_PARALLEL_WORKER_LIMIT,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"autovac_idx_parallel_min_rows", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the minimum number of dead tuples in a single table that requires parallel index processing during autovacuum."),
+ NULL
+ },
+ &autovac_idx_parallel_min_rows,
+ 0, 0, INT32_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"autovac_idx_parallel_min_indexes", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the minimum number of indexes created on a single table that requires parallel index processing during autovacuum."),
+ NULL
+ },
+ &autovac_idx_parallel_min_indexes,
+ 2, 2, INT32_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"tcp_keepalives_idle", PGC_USERSET, CONN_AUTH_TCP,
gettext_noop("Time between issuing TCP keepalives."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..08869398039 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -146,6 +146,12 @@
#hash_mem_multiplier = 2.0 # 1-1000.0 multiplier on hash table work_mem
#maintenance_work_mem = 64MB # min 64kB
#autovacuum_work_mem = -1 # min 64kB, or -1 to use maintenance_work_mem
+#max_parallel_index_autovac_workers = 0 # this feature is disabled by default
+ # (change requires restart)
+#autovac_idx_parallel_min_rows = 0
+ # (change requires restart)
+#autovac_idx_parallel_min_indexes = 2
+ # (change requires restart)
#logical_decoding_work_mem = 64MB # min 64kB
#max_stack_depth = 2MB # min 100kB
#shared_memory_type = mmap # the default is the first option
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..8647154437b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -15,6 +15,8 @@
#define AUTOVACUUM_H
#include "storage/block.h"
+#include "storage/dsm_impl.h"
+#include "storage/lock.h"
/*
* Other processes can request specific work from autovacuum, identified by
@@ -25,12 +27,25 @@ typedef enum
AVW_BRINSummarizeRange,
} AutoVacuumWorkItemType;
+/*
+ * Magic number for parallel context TOC. Used for parallel index processing
+ * during autovacuum.
+ */
+#define AV_PARALLEL_MAGIC 0xaaaaaaaa
+
+/* Magic numbers for per-context parallel index processing state sharing. */
+#define AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT UINT64CONST(0xFFF0000000000001)
+#define AV_PARALLEL_KEY_ACTIVE_SNAPSHOT UINT64CONST(0xFFF0000000000002)
+
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
+extern PGDLLIMPORT int max_parallel_index_autovac_workers;
+extern PGDLLIMPORT int autovac_idx_parallel_min_rows;
+extern PGDLLIMPORT int autovac_idx_parallel_min_indexes;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
extern PGDLLIMPORT int autovacuum_vac_max_thresh;
@@ -60,10 +75,18 @@ extern void AutoVacWorkerFailed(void);
pg_noreturn extern void AutoVacLauncherMain(const void *startup_data, size_t startup_data_len);
pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t startup_data_len);
+/* called from autovac worker when it needs participants in parallel index cleanup */
+extern bool AutoVacParallelWorkRequest(void);
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+extern void AutoVacuumReleaseParallelWork(bool keep_lock);
+extern int AutoVacuumParallelWorkWaitForStart(void);
+extern void ParallelAutovacuumEndSyncPoint(bool keep_lock);
+extern int GetAutoVacuumParallelWorkerNumber(void);
+extern int LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ff07c33d867
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,137 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 1_000_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ );
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+my $dead_tuples_thresh = $initial_rows_num / 4;
+my $indexes_num_thresh = $indexes_num / 2;
+my $num_workers = 3;
+
+# Reduce autovacuum_work_mem so that the leader process performs the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_work_mem = 2048
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum_max_workers = 10
+ autovacuum = on
+ autovac_idx_parallel_min_rows = $dead_tuples_thresh
+ autovac_idx_parallel_min_indexes = $indexes_num_thresh
+ max_parallel_index_autovac_workers = $num_workers
+});
+
+$node->restart;
+
+# wait for autovacuum to reset datfrozenxid age to 0
+$node->poll_query_until('postgres', q{
+ SELECT count(*) = 0 FROM pg_database WHERE mxid_age(datfrozenxid) > 0
+}) or die "Timed out while waiting for autovacuum";
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-02 20:17 Sami Imseih <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Sami Imseih @ 2025-05-02 20:17 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
> On Fri, May 2, 2025 at 11:58 PM Sami Imseih <[email protected]> wrote:
> >
> > I am generally -1 on the idea of autovacuum performing parallel
> > index vacuum, because I always felt that the parallel option should
> > be employed in a targeted manner for a specific table. If you have a bunch
> > of large tables, some more important than others, a/v may end
> > up using parallel resources on the least important tables and you
> > will have to adjust a/v settings per table, etc to get the right table
> > to be parallel index vacuumed by a/v.
>
> Hm, this is a good point. I think I should clarify one thing - in
> practice, there is a common situation when users have one huge table
> among all databases (with 80+ indexes created on it). But, of course,
> in general there may be few such tables.
> But we can still adjust the autovac_idx_parallel_min_rows parameter.
> If a table has a lot of dead tuples => it is actively used => table is
> important (?).
> Also, if the user can really determine the "importance" of each of the
> tables - we can provide an appropriate table option. Tables with this
> option set will be processed in parallel in priority order. What do
> you think about such an idea?
I think in most cases, the user will want to determine the priority of
a table getting parallel vacuum cycles rather than having autovacuum
determine the priority. I also see users wanting to stagger vacuums of
large tables with many indexes over some time period, and give the
tables the full number of parallel workers they can afford during those
specific periods. A/V currently does not really allow for this type of
scheduling, and if we provide some kind of GUC to prioritize tables, I
think users will constantly have to modify this priority.
I am basing my comments on the scenarios I have seen in the field, and others
may have a different opinion.
> > Also, with the TIDStore improvements for index cleanup, and the practical
> > elimination of multi-pass index vacuums, I see this being even less
> > convincing as something to add to a/v.
>
> If I understood correctly, then we are talking about the fact that
> TIDStore can store so many tuples that in fact a second pass is never
> needed.
> But the number of passes does not affect the presented optimization in
> any way. We must think about a large number of indexes that must be
> processed. Even within a single pass we can have a 40% increase in
> speed.
I am not discounting that a single table vacuum with many indexes will
maybe perform better with parallel index scan, I am merely saying that
the TIDStore optimization now makes index vacuums better and perhaps
there is less of an incentive to use parallel.
> > Now, If I am going to allocate extra workers to run vacuum in parallel, why
> > not just provide more autovacuum workers instead so I can get more tables
> > vacuumed within a span of time?
>
> For now, only one process can clean up indexes, so I don't see how
> increasing the number of a/v workers will help in the situation that I
> mentioned above.
> Also, we don't consume additional resources during autovacuum in this
> patch - total number of a/v workers always <= autovacuum_max_workers.
Increasing a/v workers will not help speed up a specific table. What I
am suggesting is that, instead of speeding up one table, we just allow
other tables to not be starved of a/v cycles due to lack of a/v workers.
--
Sami
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-02 22:06 Masahiko Sawada <[email protected]>
parent: Sami Imseih <[email protected]>
1 sibling, 0 replies; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-02 22:06 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Fri, May 2, 2025 at 9:58 AM Sami Imseih <[email protected]> wrote:
>
> > Once we have parallel heap vacuum, as discussed in thread[1], it would
> > also likely be beneficial to incorporate it into autovacuum during
> > aggressive vacuum or failsafe mode.
>
> IIRC, index cleanup is disabled by failsafe.
Yes. My idea is to use parallel *heap* vacuum in autovacuum during
failsafe mode. I think it would make sense as users want to complete
freezing tables as soon as possible in this situation.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-02 22:27 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-02 22:27 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Fri, May 2, 2025 at 11:13 AM Daniil Davydov <[email protected]> wrote:
>
> On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <[email protected]> wrote:
> >
> > As I understand it, we initially disabled parallel vacuum for
> > autovacuum because their objectives are somewhat contradictory.
> > Parallel vacuum aims to accelerate the process by utilizing additional
> > resources, while autovacuum is designed to perform cleaning operations
> > with minimal impact on foreground transaction processing (e.g.,
> > through vacuum delay).
> >
> Yep, we also decided that we must not create more a/v workers for
> index processing.
> In current implementation, the leader process sends a signal to the
> a/v launcher, and the launcher tries to launch all requested workers.
> But the number of workers never exceeds `autovacuum_max_workers`.
> Thus, we will never have more a/v workers than in the standard case
> (without this feature).
I have concerns about this design. When autovacuuming on a single
table consumes all available autovacuum_max_workers slots with
parallel vacuum workers, the system becomes incapable of processing
other tables. This means that when determining the appropriate
autovacuum_max_workers value, users must consider not only the number
of tables to be processed concurrently but also the potential number
of parallel workers that might be launched. I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
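The two capping options described above could be combined roughly as in the following C sketch. This is only an illustration under stated assumptions: ParallelCapState, reserve_parallel_workers and release_parallel_workers are hypothetical names that do not appear in the patch, and a real implementation would keep the counter in shared memory and protect it with a lock.

```c
#include <assert.h>

/* Hypothetical bookkeeping for parallel workers used by autovacuum. */
typedef struct ParallelCapState
{
    int per_worker_cap;   /* max parallel workers one a/v worker may use */
    int total_cap;        /* system-wide max across all a/v workers */
    int total_in_use;     /* parallel workers currently reserved */
} ParallelCapState;

/*
 * Reserve up to 'requested' parallel workers while honoring both caps.
 * Returns the number actually granted, which may be 0.
 */
int
reserve_parallel_workers(ParallelCapState *state, int requested)
{
    int granted = requested;
    int available;

    assert(state->total_in_use >= 0);
    assert(state->total_in_use <= state->total_cap);

    if (granted < 0)
        granted = 0;

    /* Per-autovacuum-worker limit */
    if (granted > state->per_worker_cap)
        granted = state->per_worker_cap;

    /* System-wide limit */
    available = state->total_cap - state->total_in_use;
    if (granted > available)
        granted = available;

    state->total_in_use += granted;
    return granted;
}

/* Give back workers once the index vacuum or cleanup phase is done. */
void
release_parallel_workers(ParallelCapState *state, int nworkers)
{
    assert(nworkers >= 0 && nworkers <= state->total_in_use);
    state->total_in_use -= nworkers;
}
```

With either cap in place, autovacuum_max_workers keeps its existing meaning (how many tables can be processed concurrently), and a single large table cannot take every slot.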
>
> > Regarding implementation: I notice the WIP patch implements its own
> > parallel vacuum mechanism for autovacuum. Have you considered simply
> > setting at_params.nworkers to a value greater than zero?
> >
> About `at_params.nworkers = N` - that's exactly what we're doing (you
> can see it in the `vacuum_rel` function). But we cannot fully reuse
> code of VACUUM PARALLEL, because it creates its own processes via
> dynamic bgworkers machinery.
> As I said above - we don't want to consume additional resources. Also
> we don't want to complicate communication between processes (the idea
> is that a/v workers can only send signals to the a/v launcher).
Could you elaborate on the reasons why you don't want to use
background workers and avoid complicated communication between
processes? I'm not sure whether these concerns provide sufficient
justification for implementing its own parallel index processing.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-02 22:59 Sami Imseih <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Sami Imseih @ 2025-05-02 22:59 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
> I think it would make more
> sense to maintain the existing autovacuum_max_workers parameter while
> introducing a new parameter that would either control the maximum
> number of parallel vacuum workers per autovacuum worker or set a
> system-wide cap on the total number of parallel vacuum workers.
+1, and would it make sense for parallel workers to come from
max_parallel_maintenance_workers? This is capped by
max_parallel_workers and max_worker_processes, so increasing
the defaults for all 3 will be needed as well.
--
Sami
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-03 07:32 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-05-03 07:32 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sat, May 3, 2025 at 3:17 AM Sami Imseih <[email protected]> wrote:
>
> I think in most cases, the user will want to determine the priority of
> a table getting parallel vacuum cycles rather than having the autovacuum
> determine the priority. I also see users wanting to stagger
> vacuums of large tables with many indexes through some time period,
> and give the tables the full amount of parallel workers they can afford
> at these specific periods of time. A/V currently does not really allow
> for this type of scheduling, and if we give some kind of GUC to
> prioritize tables, I think users will constantly have to be modifying
> this priority.
If the user wants to determine the priority themselves, we need to
introduce some parameter (GUC or table option) anyway that will give us a
hint about how we should schedule a/v work.
Do you think we should consider a more comprehensive behavior for
such a parameter (so that the user doesn't have to change it often)? I
will be glad to know your thoughts.
> > If I understood correctly, then we are talking about the fact that
> > TIDStore can store so many tuples that in fact a second pass is never
> > needed.
> > But the number of passes does not affect the presented optimization in
> > any way. We must think about a large number of indexes that must be
> > processed. Even within a single pass we can have a 40% increase in
> > speed.
>
> I am not discounting that a single table vacuum with many indexes will
> maybe perform better with parallel index scan, I am merely saying that
> the TIDStore optimization now makes index vacuums better and perhaps
> there is less of an incentive to use parallel.
I still insist that this does not affect the parallel index vacuum,
because we don't get an advantage in repeated passes. We get the same
speed increase whether we have this optimization or not.
Although it's even possible that the opposite is true - the situation
will be better with the new TIDStore, but I can't say for sure.
> > > Now, If I am going to allocate extra workers to run vacuum in parallel, why
> > > not just provide more autovacuum workers instead so I can get more tables
> > > vacuumed within a span of time?
> >
> > For now, only one process can clean up indexes, so I don't see how
> > increasing the number of a/v workers will help in the situation that I
> > mentioned above.
> > Also, we don't consume additional resources during autovacuum in this
> > patch - total number of a/v workers always <= autovacuum_max_workers.
>
> Increasing a/v workers will not help speed up a specific table, what I
> am suggesting is that instead of speeding up one table, let's just allow
> other tables to not be starved of a/v cycles due to lack of a/v workers.
OK, I got it. But what if vacuuming a single table takes (for
example) 60% of the total time? This is still a possible situation, and
vacuuming all the other tables quickly will not help us.
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-03 08:10 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-05-03 08:10 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <[email protected]> wrote:
>
> > In current implementation, the leader process sends a signal to the
> > a/v launcher, and the launcher tries to launch all requested workers.
> > But the number of workers never exceeds `autovacuum_max_workers`.
> > Thus, we will never have more a/v workers than in the standard case
> > (without this feature).
>
> I have concerns about this design. When autovacuuming on a single
> table consumes all available autovacuum_max_workers slots with
> parallel vacuum workers, the system becomes incapable of processing
> other tables. This means that when determining the appropriate
> autovacuum_max_workers value, users must consider not only the number
> of tables to be processed concurrently but also the potential number
> of parallel workers that might be launched. I think it would make more
> sense to maintain the existing autovacuum_max_workers parameter while
> introducing a new parameter that would either control the maximum
> number of parallel vacuum workers per autovacuum worker or set a
> system-wide cap on the total number of parallel vacuum workers.
>
For now we have max_parallel_index_autovac_workers - this GUC limits
the number of parallel a/v workers that can process a single table. I
agree that the scenario you provided is problematic.
The proposal to limit the total number of supportive a/v workers seems
attractive to me (I'll implement it as an experiment).
It seems to me that this question is becoming a key one. First we need
to determine the role of the user in the whole scheduling mechanism.
Should we allow users to determine priority? Will this priority apply
only within a single vacuuming cycle, or will it be more 'global'?
I guess I don't have enough expertise to determine this alone. I will
be glad to receive any suggestions.
> > About `at_params.nworkers = N` - that's exactly what we're doing (you
> > can see it in the `vacuum_rel` function). But we cannot fully reuse
> > code of VACUUM PARALLEL, because it creates its own processes via
> > dynamic bgworkers machinery.
> > As I said above - we don't want to consume additional resources. Also
> > we don't want to complicate communication between processes (the idea
> > is that a/v workers can only send signals to the a/v launcher).
>
> Could you elaborate on the reasons why you don't want to use
> background workers and avoid complicated communication between
> processes? I'm not sure whether these concerns provide sufficient
> justification for implementing its own parallel index processing.
>
Here are my thoughts on this. A/v worker has a very simple role - it
is born after the launcher's request and must do exactly one 'task' -
vacuum table or participate in parallel index vacuum.
We also have a dedicated 'launcher' role, meaning the whole design
implies that only the launcher is able to launch processes.
If we allow an a/v worker to use bgworkers, then:
1) The a/v worker will go far beyond its responsibility.
2) Its functionality will overlap with the functionality of the launcher.
3) Resource consumption can jump dramatically, which is unexpected for
the user. Autovacuum will also be dependent on other resources
(bgworkers pool). The current design does not imply this.
I wanted to create a patch that would fit into the existing mechanism
without drastic innovations. But if you think that the above is not so
important, then we can reuse VACUUM PARALLEL code and it would
simplify the final implementation.
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-03 08:17 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-05-03 08:17 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sat, May 3, 2025 at 5:59 AM Sami Imseih <[email protected]> wrote:
>
> > I think it would make more
> > sense to maintain the existing autovacuum_max_workers parameter while
> > introducing a new parameter that would either control the maximum
> > number of parallel vacuum workers per autovacuum worker or set a
> > system-wide cap on the total number of parallel vacuum workers.
>
> +1, and would it make sense for parallel workers to come from
> max_parallel_maintenance_workers? This is capped by
> max_parallel_workers and max_worker_processes, so increasing
> the defaults for all 3 will be needed as well.
I may be wrong, but the `max_parallel_maintenance_workers` parameter
is only used for commands that are explicitly run by the user. We
already have `autovacuum_max_workers`, and I think the code will be
more consistent if we adapt this particular parameter (perhaps with
the addition of a new one, as I wrote in the previous email).
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-05 23:56 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-05 23:56 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <[email protected]> wrote:
>
> On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <[email protected]> wrote:
> >
> > > In current implementation, the leader process sends a signal to the
> > > a/v launcher, and the launcher tries to launch all requested workers.
> > > But the number of workers never exceeds `autovacuum_max_workers`.
> > > Thus, we will never have more a/v workers than in the standard case
> > > (without this feature).
> >
> > I have concerns about this design. When autovacuuming on a single
> > table consumes all available autovacuum_max_workers slots with
> > parallel vacuum workers, the system becomes incapable of processing
> > other tables. This means that when determining the appropriate
> > autovacuum_max_workers value, users must consider not only the number
> > of tables to be processed concurrently but also the potential number
> > > of parallel workers that might be launched. I think it would make more
> > sense to maintain the existing autovacuum_max_workers parameter while
> > introducing a new parameter that would either control the maximum
> > number of parallel vacuum workers per autovacuum worker or set a
> > system-wide cap on the total number of parallel vacuum workers.
> >
>
> For now we have max_parallel_index_autovac_workers - this GUC limits
> the number of parallel a/v workers that can process a single table. I
> agree that the scenario you provided is problematic.
> The proposal to limit the total number of supportive a/v workers seems
> attractive to me (I'll implement it as an experiment).
>
> It seems to me that this question is becoming a key one. First we need
> to determine the role of the user in the whole scheduling mechanism.
> Should we allow users to determine priority? Will this priority apply
> only within a single vacuuming cycle, or will it be more 'global'?
> I guess I don't have enough expertise to determine this alone. I will
> be glad to receive any suggestions.
What I roughly imagined is that we don't need to change the entire
autovacuum scheduling, but would like autovacuum workers to decide
whether or not to use parallel vacuum during their vacuum operation
based on GUC parameters (having a global effect) or storage parameters
(having an effect on the particular table). The criteria for triggering
parallel vacuum in autovacuum might need to be somewhat pessimistic so
that we don't unnecessarily use parallel vacuum on many tables.
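To make the idea concrete, here is a minimal C sketch of such a per-table decision. The struct fields mirror the GUCs from the POC patch (max_parallel_index_autovac_workers, autovac_idx_parallel_min_rows, autovac_idx_parallel_min_indexes), but the function choose_parallel_workers, the table_workers storage parameter and its "-1 means unset" convention are assumptions made only for illustration. The sketch also assumes the leader processes indexes itself, so it never asks for more than nindexes - 1 helpers.

```c
#include <assert.h>

/* Global settings; the fields mirror the GUCs proposed in the POC patch. */
typedef struct ParallelAutovacSettings
{
    int max_workers;    /* cf. max_parallel_index_autovac_workers */
    int min_dead_rows;  /* cf. autovac_idx_parallel_min_rows */
    int min_indexes;    /* cf. autovac_idx_parallel_min_indexes */
} ParallelAutovacSettings;

/*
 * Decide how many parallel workers an autovacuum worker should request
 * for one table.
 *
 * table_workers is a hypothetical per-table storage parameter:
 * -1 means "not set, fall back to the GUC", 0 disables parallel vacuum.
 */
int
choose_parallel_workers(const ParallelAutovacSettings *guc,
                        int table_workers,
                        long dead_rows, int nindexes)
{
    int limit = (table_workers >= 0) ? table_workers : guc->max_workers;
    int nworkers;

    assert(dead_rows >= 0 && nindexes >= 0);

    /* Feature disabled globally or for this table */
    if (limit <= 0)
        return 0;

    /* Pessimistic criteria: stay serial unless the table is clearly big */
    if (dead_rows < guc->min_dead_rows || nindexes < guc->min_indexes)
        return 0;

    /* Assumption: the leader handles one index, helpers take the rest */
    nworkers = nindexes - 1;
    if (nworkers < 0)
        nworkers = 0;
    if (nworkers > limit)
        nworkers = limit;

    return nworkers;
}
```

With the thresholds used in the TAP test earlier in the thread (250000 dead rows, 40 indexes, 3 workers), a table with 81 indexes and roughly 333333 dead rows would qualify for 3 workers, while a table with few dead rows or few indexes would be vacuumed serially.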
>
> > > About `at_params.nworkers = N` - that's exactly what we're doing (you
> > > can see it in the `vacuum_rel` function). But we cannot fully reuse
> > > code of VACUUM PARALLEL, because it creates its own processes via
> > > dynamic bgworkers machinery.
> > > As I said above - we don't want to consume additional resources. Also
> > > we don't want to complicate communication between processes (the idea
> > > is that a/v workers can only send signals to the a/v launcher).
> >
> > Could you elaborate on the reasons why you don't want to use
> > background workers and avoid complicated communication between
> > processes? I'm not sure whether these concerns provide sufficient
> > justification for implementing its own parallel index processing.
> >
>
> Here are my thoughts on this. A/v worker has a very simple role - it
> is born after the launcher's request and must do exactly one 'task' -
> vacuum table or participate in parallel index vacuum.
> We also have a dedicated 'launcher' role, meaning the whole design
> implies that only the launcher is able to launch processes.
>
> If we allow an a/v worker to use bgworkers, then:
> 1) The a/v worker will go far beyond its responsibility.
> 2) Its functionality will overlap with the functionality of the launcher.
While I agree that the launcher process is responsible for launching
autovacuum worker processes, I'm not sure it should be responsible for
launching everything related to autovacuum. It's quite possible that we
will have parallel heap vacuum and processing of a particular index with
parallel workers in the future. The code could get more complex if we
have the autovacuum launcher process launch such parallel workers too.
I believe it's more straightforward to divide the responsibility in
such a way that the autovacuum launcher is responsible for launching
autovacuum workers and autovacuum workers are responsible for
vacuuming tables, however they choose to do that.
> 3) Resource consumption can jump dramatically, which is unexpected for
> the user.
What extra resources could be used if we use background workers
instead of autovacuum workers?
> Autovacuum will also be dependent on other resources
> (bgworkers pool). The current design does not imply this.
I see your point, but I don't think this necessarily needs to be reflected
at the infrastructure layer. For example, we can internally allocate
extra background worker slots for parallel vacuum workers based on
max_parallel_index_autovac_workers in addition to
max_worker_processes. Anyway, we might need something to check or
validate the max_worker_processes value to make sure that every autovacuum
worker can use the specified number of parallel workers for parallel
vacuum.
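As a rough illustration of the two options in this paragraph (reserving extra slots internally versus validating the existing pool), the arithmetic could look like the C sketch below. The function names are hypothetical, and the worst case assumed here is that every autovacuum worker runs at its full parallel limit at the same time.

```c
#include <assert.h>
#include <stdbool.h>

/* Worst-case demand: every a/v worker uses its full parallel limit. */
int
parallel_autovac_worst_case(int autovacuum_max_workers,
                            int max_parallel_index_autovac_workers)
{
    assert(autovacuum_max_workers >= 0);
    assert(max_parallel_index_autovac_workers >= 0);

    return autovacuum_max_workers * max_parallel_index_autovac_workers;
}

/*
 * Option A: reserve extra background worker slots internally, on top of
 * max_worker_processes, so autovacuum never competes with other users
 * of the background worker pool.
 */
int
total_bgworker_slots(int max_worker_processes,
                     int autovacuum_max_workers,
                     int max_parallel_index_autovac_workers)
{
    return max_worker_processes +
        parallel_autovac_worst_case(autovacuum_max_workers,
                                    max_parallel_index_autovac_workers);
}

/*
 * Option B: keep a single pool and only check that it is large enough
 * for the worst case.
 */
bool
worker_pool_is_sufficient(int max_worker_processes,
                          int autovacuum_max_workers,
                          int max_parallel_index_autovac_workers)
{
    return max_worker_processes >=
        parallel_autovac_worst_case(autovacuum_max_workers,
                                    max_parallel_index_autovac_workers);
}
```

For example, with max_worker_processes = 8, autovacuum_max_workers = 3 and a parallel limit of 3, the worst case is 9 workers: option A would size the pool at 17 slots, while option B would report the current setting as too small.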
> I wanted to create a patch that would fit into the existing mechanism
> without drastic innovations. But if you think that the above is not so
> important, then we can reuse VACUUM PARALLEL code and it would
> simplify the final implementation.
I'd suggest using the existing infrastructure if we can achieve the
goal with it. If we find out there are technical difficulties in
implementing it without new infrastructure, we can revisit this approach.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-06 00:21 Sami Imseih <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 2 replies; 112+ messages in thread
From: Sami Imseih @ 2025-05-06 00:21 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <[email protected]>
> wrote:
> >
> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <[email protected]>
> wrote:
> > >
> > > > In current implementation, the leader process sends a signal to the
> > > > a/v launcher, and the launcher tries to launch all requested workers.
> > > > But the number of workers never exceeds `autovacuum_max_workers`.
> > > > Thus, we will never have more a/v workers than in the standard case
> > > > (without this feature).
> > >
> > > I have concerns about this design. When autovacuuming on a single
> > > table consumes all available autovacuum_max_workers slots with
> > > parallel vacuum workers, the system becomes incapable of processing
> > > other tables. This means that when determining the appropriate
> > > autovacuum_max_workers value, users must consider not only the number
> > > of tables to be processed concurrently but also the potential number
> > > > of parallel workers that might be launched. I think it would make more
> > > sense to maintain the existing autovacuum_max_workers parameter while
> > > introducing a new parameter that would either control the maximum
> > > number of parallel vacuum workers per autovacuum worker or set a
> > > system-wide cap on the total number of parallel vacuum workers.
> > >
> >
> > For now we have max_parallel_index_autovac_workers - this GUC limits
> > the number of parallel a/v workers that can process a single table. I
> > agree that the scenario you provided is problematic.
> > The proposal to limit the total number of supportive a/v workers seems
> > attractive to me (I'll implement it as an experiment).
> >
> > It seems to me that this question is becoming a key one. First we need
> > to determine the role of the user in the whole scheduling mechanism.
> > Should we allow users to determine priority? Will this priority affect
> > only within a single vacuuming cycle, or it will be more 'global'?
> > I guess I don't have enough expertise to determine this alone. I will
> > be glad to receive any suggestions.
>
> What I roughly imagined is that we don't need to change the entire
> autovacuum scheduling, but would like autovacuum workers to decide
> whether or not to use parallel vacuum during its vacuum operation
> based on GUC parameters (having a global effect) or storage parameters
> (having an effect on the particular table). The criteria of triggering
> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> that we don't unnecessarily use parallel vacuum on many tables.
Perhaps we should only provide a reloption, so that only tables specified
by the user via the reloption can be autovacuumed in parallel?
This gives a targeted approach. Of course, if several of these allowed tables
are to be autovacuumed at the same time, some may not get all the workers,
but that's no different from manually vacuuming the tables in parallel
at the same time.
What do you think?
--
Sami
^ permalink raw reply [nested|flat] 112+ messages in thread
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-06 04:54 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-05-06 04:54 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Tue, May 6, 2025 at 6:57 AM Masahiko Sawada <[email protected]> wrote:
>
> What I roughly imagined is that we don't need to change the entire
> autovacuum scheduling, but would like autovacuum workers to decide
> whether or not to use parallel vacuum during its vacuum operation
> based on GUC parameters (having a global effect) or storage parameters
> (having an effect on the particular table). The criteria of triggering
> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> that we don't unnecessarily use parallel vacuum on many tables.
>
+1, I see it the same way. I will expand on this topic in
more detail in my reply to Sami's message [1], so as not to repeat
myself.
> > Here are my thoughts on this. A/v worker has a very simple role - it
> > is born after the launcher's request and must do exactly one 'task' -
> > vacuum table or participate in parallel index vacuum.
> > We also have a dedicated 'launcher' role, meaning the whole design
> > implies that only the launcher is able to launch processes.
> >
> > If we allow a/v worker to use bgworkers, then :
> > 1) A/v worker will go far beyond his responsibility.
> > 2) Its functionality will overlap with the functionality of the launcher.
>
> While I agree that the launcher process is responsible for launching
> autovacuum worker processes, I'm not sure it should be responsible for
> launching everything related to autovacuum. It's quite possible that we
> have parallel heap vacuum and processing the particular index with
> parallel workers in the future. The code could get more complex if we
> have the autovacuum launcher process launch such parallel workers too.
> I believe it's more straightforward to divide the responsibility so
> that the autovacuum launcher is responsible for launching
> autovacuum workers and autovacuum workers are responsible for
> vacuuming tables, however they go about it.
It sounds very tempting. At the very beginning I did exactly that (to
make sure that nothing would break in a parallel autovacuum). Only
later was it decided to abandon the use of bgworkers.
For now both approaches look fair to me. What do you think - will
others agree that we can give more responsibility to a/v workers?
> > 3) Resource consumption can jump dramatically, which is unexpected for
> > the user.
>
> What extra resources could be used if we use background workers
> instead of autovacuum workers?
I meant that more processes would take part in autovacuum than
autovacuum_max_workers indicates. And if an a/v worker uses
additional bgworkers, other operations cannot get those
resources.
> > Autovacuum will also be dependent on other resources
> > (bgworkers pool). The current design does not imply this.
>
> I see your point but I think it doesn't necessarily need to reflect it
> at the infrastructure layer. For example, we can internally allocate
> extra background worker slots for parallel vacuum workers based on
> max_parallel_index_autovac_workers in addition to
> max_worker_processes. Anyway we might need something to check or
> validate max_worker_processes value to make sure that every autovacuum
> worker can use the specified number of parallel workers for parallel
> vacuum.
I don't think that we can provide all supportive workers for each
parallel index vacuuming request. But I get your point: always keep
several bgworkers that only a/v workers can use if needed, and make the
size of this additional pool (which depends on max_worker_processes)
user-configurable.
> > I wanted to create a patch that would fit into the existing mechanism
> > without drastic innovations. But if you think that the above is not so
> > important, then we can reuse VACUUM PARALLEL code and it would
> > simplify the final implementation)
>
> I'd suggest using the existing infrastructure if we can achieve the
> goal with it. If we find out there are some technical difficulties to
> implement it without new infrastructure, we can revisit this approach.
OK, in the near future I'll implement it and send a new patch to this
thread. I'll be glad if you take a look at it)
[1] https://www.postgresql.org/message-id/CAA5RZ0vfBg%3Dc_0Sa1Tpxv8tueeBk8C5qTf9TrxKBbXUqPc99Ag%40mail.g...
--
Best regards,
Daniil Davydov
^ permalink raw reply [nested|flat] 112+ messages in thread
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-06 05:15 Masahiko Sawada <[email protected]>
parent: Sami Imseih <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-06 05:15 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, May 5, 2025 at 5:21 PM Sami Imseih <[email protected]> wrote:
>
>
>> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <[email protected]> wrote:
>> >
>> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <[email protected]> wrote:
>> > >
>> > > > In current implementation, the leader process sends a signal to the
>> > > > a/v launcher, and the launcher tries to launch all requested workers.
>> > > > But the number of workers never exceeds `autovacuum_max_workers`.
>> > > > Thus, we will never have more a/v workers than in the standard case
>> > > > (without this feature).
>> > >
>> > > I have concerns about this design. When autovacuuming on a single
>> > > table consumes all available autovacuum_max_workers slots with
>> > > parallel vacuum workers, the system becomes incapable of processing
>> > > other tables. This means that when determining the appropriate
>> > > autovacuum_max_workers value, users must consider not only the number
>> > > of tables to be processed concurrently but also the potential number
>> > > of parallel workers that might be launched. I think it would make more
>> > > sense to maintain the existing autovacuum_max_workers parameter while
>> > > introducing a new parameter that would either control the maximum
>> > > number of parallel vacuum workers per autovacuum worker or set a
>> > > system-wide cap on the total number of parallel vacuum workers.
>> > >
>> >
>> > For now we have max_parallel_index_autovac_workers - this GUC limits
>> > the number of parallel a/v workers that can process a single table. I
>> > agree that the scenario you provided is problematic.
>> > The proposal to limit the total number of supportive a/v workers seems
>> > attractive to me (I'll implement it as an experiment).
>> >
>> > It seems to me that this question is becoming a key one. First we need
>> > to determine the role of the user in the whole scheduling mechanism.
>> > Should we allow users to determine priority? Will this priority affect
>> > only within a single vacuuming cycle, or it will be more 'global'?
>> > I guess I don't have enough expertise to determine this alone. I will
>> > be glad to receive any suggestions.
>>
>> What I roughly imagined is that we don't need to change the entire
>> autovacuum scheduling, but would like autovacuum workers to decide
>> whether or not to use parallel vacuum during its vacuum operation
>> based on GUC parameters (having a global effect) or storage parameters
>> (having an effect on the particular table). The criteria of triggering
>> parallel vacuum in autovacuum might need to be somewhat pessimistic so
>> that we don't unnecessarily use parallel vacuum on many tables.
>
>
> Perhaps we should only provide a reloption, therefore only tables specified
> by the user via the reloption can be autovacuumed in parallel?
>
> This gives a targeted approach. Of course if multiple of these allowed tables
> are to be autovacuumed at the same time, some may not get all the workers,
> But that’s not different from if you are to manually vacuum in parallel the tables
> at the same time.
>
> What do you think ?
+1. I think that's a good starting point. We can later introduce a new
GUC parameter that globally controls the maximum number of parallel
vacuum workers used in autovacuum, if necessary.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
^ permalink raw reply [nested|flat] 112+ messages in thread
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-06 05:16 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
1 sibling, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-05-06 05:16 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Tue, May 6, 2025 at 7:21 AM Sami Imseih <[email protected]> wrote:
>
> Perhaps we should only provide a reloption, therefore only tables specified
> by the user via the reloption can be autovacuumed in parallel?
After your comments (earlier in this thread) I decided to do just
that. For now we have a reloption, so the user can decide which tables
are "important" for parallel index vacuuming.
We also set (hardcoded) lower bounds on the number of indexes and the
number of dead tuples. For example, there is no need to use a parallel
vacuum if the table has only one index.
The situation is more complicated with the number of dead tuples - we
need tests that would show the optimal minimum value. This issue is
still being worked out.
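The eligibility rule described above can be sketched roughly as follows. This is only an illustration: the struct and function names are hypothetical, and the thresholds are passed in as parameters because the right minimum for dead tuples has not been settled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the lower bounds discussed above. */
typedef struct ParallelAutovacCriteria
{
    int  min_indexes;      /* e.g. 2: one index gains nothing from parallelism */
    long min_dead_tuples;  /* still to be tuned by testing */
} ParallelAutovacCriteria;

/*
 * Hypothetical predicate: should autovacuum consider processing the
 * indexes of this table in parallel?
 */
bool
table_qualifies_for_parallel_autovac(bool reloption_enabled,
                                     int num_indexes,
                                     long dead_tuples,
                                     const ParallelAutovacCriteria *criteria)
{
    assert(criteria != NULL);

    /* The user must have opted this table in via the reloption. */
    if (!reloption_enabled)
        return false;

    /* Too few indexes: parallel index vacuum cannot help. */
    if (num_indexes < criteria->min_indexes)
        return false;

    /* Too little garbage: not worth starting extra workers. */
    return dead_tuples >= criteria->min_dead_tuples;
}
```

Keeping the bounds in a struct rather than as constants makes it easy to try different dead-tuple thresholds while the tests mentioned above are still pending.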
> This gives a targeted approach. Of course if multiple of these allowed tables
> are to be autovacuumed at the same time, some may not get all the workers,
> But that’s not different from if you are to manually vacuum in parallel the tables
> at the same time.
I fully agree. Recently the v2 patch was supplemented with a new
feature [1] - multiple tables in a cluster can be processed in
parallel during autovacuum. And of course, not every a/v worker can
get enough supportive processes, but this is considered normal
behavior.
The maximum number of supportive workers is limited by a GUC variable.
[1] I guess that I'll send it within the v3 patch, which will also
contain the logic that was discussed in the message above - using bgworkers
instead of additional a/v workers. BTW, what do you think about this
idea?
--
Best regards,
Daniil Davydov
^ permalink raw reply [nested|flat] 112+ messages in thread
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-06 20:11 Sami Imseih <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Sami Imseih @ 2025-05-06 20:11 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
> On Mon, May 5, 2025 at 5:21 PM Sami Imseih <[email protected]> wrote:
> >
> >
> >> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <[email protected]> wrote:
> >> >
> >> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <[email protected]> wrote:
> >> > >
> >> > > > In current implementation, the leader process sends a signal to the
> >> > > > a/v launcher, and the launcher tries to launch all requested workers.
> >> > > > But the number of workers never exceeds `autovacuum_max_workers`.
> >> > > > Thus, we will never have more a/v workers than in the standard case
> >> > > > (without this feature).
> >> > >
> >> > > I have concerns about this design. When autovacuuming on a single
> >> > > table consumes all available autovacuum_max_workers slots with
> >> > > parallel vacuum workers, the system becomes incapable of processing
> >> > > other tables. This means that when determining the appropriate
> >> > > autovacuum_max_workers value, users must consider not only the number
> >> > > of tables to be processed concurrently but also the potential number
> >> > > of parallel workers that might be launched. I think it would make more
> >> > > sense to maintain the existing autovacuum_max_workers parameter while
> >> > > introducing a new parameter that would either control the maximum
> >> > > number of parallel vacuum workers per autovacuum worker or set a
> >> > > system-wide cap on the total number of parallel vacuum workers.
> >> > >
> >> >
> >> > For now we have max_parallel_index_autovac_workers - this GUC limits
> >> > the number of parallel a/v workers that can process a single table. I
> >> > agree that the scenario you provided is problematic.
> >> > The proposal to limit the total number of supportive a/v workers seems
> >> > attractive to me (I'll implement it as an experiment).
> >> >
> >> > It seems to me that this question is becoming a key one. First we need
> >> > to determine the role of the user in the whole scheduling mechanism.
> >> > Should we allow users to determine priority? Will this priority affect
> >> > only within a single vacuuming cycle, or it will be more 'global'?
> >> > I guess I don't have enough expertise to determine this alone. I will
> >> > be glad to receive any suggestions.
> >>
> >> What I roughly imagined is that we don't need to change the entire
> >> autovacuum scheduling, but would like autovacuum workers to decide
> >> whether or not to use parallel vacuum during its vacuum operation
> >> based on GUC parameters (having a global effect) or storage parameters
> >> (having an effect on the particular table). The criteria of triggering
> >> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> >> that we don't unnecessarily use parallel vacuum on many tables.
> >
> >
> > Perhaps we should only provide a reloption, therefore only tables specified
> > by the user via the reloption can be autovacuumed in parallel?
> >
> > This gives a targeted approach. Of course if multiple of these allowed tables
> > are to be autovacuumed at the same time, some may not get all the workers,
> > But that’s not different from if you are to manually vacuum in parallel the tables
> > at the same time.
> >
> > What do you think ?
>
> +1. I think that's a good starting point. We can later introduce a new
> GUC parameter that globally controls the maximum number of parallel
> vacuum workers used in autovacuum, if necessary.
And I think this reloption should also apply to parallel heap vacuum in
non-failsafe scenarios.
In the failsafe case, however, all tables will be eligible for parallel
vacuum. Anyhow, that discussion could be taken up in that thread, but I
wanted to point that out.
--
Sami Imseih
Amazon Web Services (AWS)
^ permalink raw reply [nested|flat] 112+ messages in thread
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-09 18:33 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-05-09 18:33 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
As I promised - meet parallel index autovacuum with bgworkers
(Parallel-index-autovacuum-with-bgworkers.patch). This is a pretty
simple implementation:
1) Added a new table option `parallel_idx_autovac_enabled` that must be
set to `true` if the user wants autovacuum to process the table in parallel.
2) Added a new GUC variable `autovacuum_reserved_workers_num`. This is the
number of parallel workers from the bgworkers pool that can be used only
by autovacuum workers. The `autovacuum_reserved_workers_num` parameter
actually reserves the requested part of the processes, the total number
of which is equal to `max_worker_processes`.
3) When an autovacuum worker decides to process some table in
parallel, it just sets `VacuumParams->nworkers` to an appropriate value
(> 0) and then the code is executed as if it were a regular VACUUM
PARALLEL.
4) I kept test/modules/autovacuum as a sandbox where you can play with
parallel index autovacuum a bit.
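A standalone sketch of items 2) and 3), for readers who want the arithmetic without reading the diff. It mirrors the worker-count formula and the slot partitioning used in the attached patch, but the function and type names here are hypothetical, and returning 0 when nothing is reserved is this sketch's own convention rather than the `VacuumParams` semantics.

```c
#include <assert.h>

/* Same value as NUM_INDEXES_PER_PARALLEL_WORKER in the attached patch. */
#define INDEXES_PER_PARALLEL_WORKER 30

/*
 * Number of parallel workers to request for one table: one worker per
 * 30 indexes (at least one), capped by the reserved pool size.
 */
int
autovac_parallel_workers(int num_indexes, int reserved_workers)
{
    int wanted;

    assert(num_indexes >= 0);

    if (reserved_workers <= 0)
        return 0;               /* nothing reserved: stay serial */

    wanted = num_indexes / INDEXES_PER_PARALLEL_WORKER + 1;
    return wanted < reserved_workers ? wanted : reserved_workers;
}

/* Half-open range [from, upto) of bgworker slots a process may claim. */
typedef struct SlotRange
{
    int from;
    int upto;
} SlotRange;

/*
 * The last 'reserved_workers' slots belong to autovacuum workers only;
 * every other process searches the slots before them.
 */
SlotRange
usable_slot_range(int total_slots, int reserved_workers,
                  int is_autovacuum_worker)
{
    SlotRange range;

    assert(reserved_workers >= 0 && reserved_workers <= total_slots);

    if (is_autovacuum_worker)
    {
        range.from = total_slots - reserved_workers;
        range.upto = total_slots;
    }
    else
    {
        range.from = 0;
        range.upto = total_slots - reserved_workers;
    }
    return range;
}
```

For example, a table with 65 indexes and 4 reserved workers would request 3 workers, and with 8 slots and 2 reserved, autovacuum workers would search slots 6 and 7 while everything else is limited to slots 0 through 5.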
What do you think about this implementation?
P.S.
I also improved the "self-managed" parallel autovacuum implementation
(Self-managed-parallel-index-autovacuum.patch). For now it needs a lot
of refactoring, but all features are working well.
Both patches target the master branch
(bc35adee8d7ad38e7bef40052f196be55decddec)
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v1-0001-Parallel-index-autovacuum-with-bgworkers.patch (23.0K, 2-v1-0001-Parallel-index-autovacuum-with-bgworkers.patch)
download | inline diff:
From cfb7e675d9a1b05aef0cdaeeca5f6edd4bcd3b70 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 10 May 2025 01:07:42 +0700
Subject: [PATCH v1] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuum.c | 55 ++++++++
src/backend/commands/vacuumparallel.c | 46 ++++---
src/backend/postmaster/autovacuum.c | 9 ++
src/backend/postmaster/bgworker.c | 33 ++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 13 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 10 ++
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
.../autovacuum/t/001_autovac_parallel.pl | 129 ++++++++++++++++++
14 files changed, 307 insertions(+), 19 deletions(-)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..ccf59208783 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_idx_autovac_enabled",
+ "Allows autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1863,6 +1872,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_idx_autovac_enabled", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_idx_autovac_enabled)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..f7667f14147 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,21 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes should process each parallel worker during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2246,49 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (av_reserved_workers_num > 0)
+ {
+ /*
+ * We request at least one parallel worker, if user set
+ * 'parallel_idx_autovac_enabled' option. The total number of
+ * additional parallel workers depends on how many indexes the
+ * table has. For now we assume that each parallel worker should
+ * process NUM_INDEXES_PER_PARALLEL_WORKER indexes.
+ */
+ params->nworkers =
+ Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
+ av_reserved_workers_num);
+ }
+ else
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("Cannot launch any supportive workers for parallel index cleanup of rel %s",
+ RelationGetRelationName(rel)),
+ errhint("You might need to set parameter \"av_reserved_workers_num\" to a value > 0")));
+
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..e2b3e5b343c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,15 +1,15 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (av_reserved_workers_num == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, av_reserved_workers_num) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -982,8 +996,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..725d3231f77 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3406,6 +3406,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval > (max_worker_processes - 8))
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..cb86db99da9 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1046,6 +1046,8 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
BackgroundWorkerHandle **handle)
{
int slotno;
+ int from;
+ int upto;
bool success = false;
bool parallel;
uint64 generation = 0;
@@ -1088,10 +1090,23 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
return false;
}
+ /*
+ * Determine range of workers in pool, that we can use (last
+ * 'av_reserved_workers_num' is reserved for autovacuum workers).
+ */
+
+ from = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots - av_reserved_workers_num :
+ 0;
+
+ upto = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots :
+ BackgroundWorkerData->total_slots - av_reserved_workers_num;
+
/*
* Look for an unused slot. If we find one, grab it.
*/
- for (slotno = 0; slotno < BackgroundWorkerData->total_slots; ++slotno)
+ for (slotno = from; slotno < upto; ++slotno)
{
BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
@@ -1159,7 +1174,13 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
BackgroundWorkerSlot *slot;
pid_t pid;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'av_reserved_workers_num' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/*
@@ -1298,7 +1319,13 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
BackgroundWorkerSlot *slot;
bool signal_postmaster = false;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'av_reserved_workers_num' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/* Set terminate flag in shared memory, unless slot has been reused. */
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 92b0446b80c..cff13ef6bd7 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -144,6 +144,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int av_reserved_workers_num = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..87cd4e20786 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,19 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Sets the number of worker processes reserved for parallel index processing during autovacuum."),
+ gettext_noop("This parameter depends on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these additional processes. "
+ "These processes also count toward \"max_parallel_workers\"."),
+ NULL,
+ },
+ &av_reserved_workers_num,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_reserved_workers_num, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..2e38bada2b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,6 +223,7 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#autovacuum_reserved_workers_num = 0 # disabled by default and limited by max_parallel_workers
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1e59a7f910f..992c6b63226 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int NBuffers;
extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
+extern PGDLLIMPORT int av_reserved_workers_num;
extern PGDLLIMPORT int max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..9913c6e4681 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..55aa5c45be1 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,7 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ bool parallel_idx_autovac_enabled;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +410,15 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationAllowsParallelIdxAutovac
+ * Returns whether the relation's indexes can be processed in parallel
+ * during autovacuum. Note multiple eval of argument!
+ */
+#define RelationAllowsParallelIdxAutovac(relation) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_idx_autovac_enabled : false)
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..a37aaf720f2
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,129 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_reserved_workers_num = 1
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_idx_autovac_enabled = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
--
2.43.0
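For readers following the bgworker.c hunks in the patch above: the slot search range is split so that ordinary callers scan the low slots and autovacuum workers scan only the reserved tail. The following is a minimal standalone sketch of that partition, not the patch's code. The names SlotRange and slot_range_for are hypothetical, and plain ints stand in for BackgroundWorkerData->total_slots and av_reserved_workers_num.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type: half-open range [from, upto) of slots a caller may scan. */
typedef struct SlotRange
{
	int			from;
	int			upto;
} SlotRange;

/*
 * Hypothetical helper mirroring the from/upto computation in the patch:
 * autovacuum workers get the last 'reserved' slots, everyone else gets
 * the slots before them.
 */
static SlotRange
slot_range_for(bool is_autovacuum_worker, int total_slots, int reserved)
{
	SlotRange	r;

	/* The reservation must fit inside the pool. */
	assert(reserved >= 0 && reserved <= total_slots);

	if (is_autovacuum_worker)
	{
		r.from = total_slots - reserved;
		r.upto = total_slots;
	}
	else
	{
		r.from = 0;
		r.upto = total_slots - reserved;
	}
	return r;
}
```

With the test configuration above (max_worker_processes = 20, one reserved worker), an autovacuum worker would scan only slot 19 and every other caller would scan slots 0 through 18. With the default reservation of 0, the autovacuum range is empty.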
[text/x-patch] v3-0001-Self-managed-parallel-index-autovacuum.patch (69.0K, 3-v3-0001-Self-managed-parallel-index-autovacuum.patch)
From 96ab66f2bfe1146e20703b725b2aa8c91f6a237f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 9 May 2025 17:14:06 +0700
Subject: [PATCH v3] Meet parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 +
src/backend/commands/vacuum.c | 36 +
src/backend/commands/vacuumparallel.c | 286 ++++-
src/backend/postmaster/autovacuum.c | 1026 ++++++++++++++++-
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/include/postmaster/autovacuum.h | 27 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 13 +-
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 +
.../autovacuum/t/001_autovac_parallel.pl | 135 +++
12 files changed, 1517 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
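As a reading aid for the vacuum.c hunk in this patch: the number of requested workers grows with the index count and is capped by max_parallel_index_autovac_workers. Below is a minimal standalone sketch of that arithmetic. The function name autovac_parallel_workers is hypothetical, and the value 30 for INDEXES_PER_WORKER is an assumption, since the definition of AV_PARALLEL_INDEXES_PER_WORKER is not visible in this part of the patch.

```c
#include <assert.h>

/* Assumed value; the patch's AV_PARALLEL_INDEXES_PER_WORKER is not shown here. */
#define INDEXES_PER_WORKER 30

/*
 * Hypothetical helper: number of parallel workers to request for a table
 * with 'num_indexes' indexes, capped by 'max_workers'.  Returns 0 when the
 * cap is not positive; in the patch that case leaves params->nworkers
 * untouched.
 */
static int
autovac_parallel_workers(int num_indexes, int max_workers)
{
	int			wanted;

	assert(num_indexes >= 0);

	if (max_workers <= 0)
		return 0;

	/* Integer division, plus one so that small tables still get a worker. */
	wanted = num_indexes / INDEXES_PER_WORKER + 1;

	return (wanted < max_workers) ? wanted : max_workers;
}
```

Under the assumed divisor, the 81 indexes created by the TAP test (80 b-tree indexes plus the primary key) would request 81 / 30 + 1 = 3 workers, unless the GUC cap is lower.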
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..b9d642a7a45 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_idx_autovac",
+ "Enables autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1905,6 +1914,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
offsetof(StdRdOptions, vacuum_index_cleanup)},
{"vacuum_truncate", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, vacuum_truncate), offsetof(StdRdOptions, vacuum_truncate_set)},
+ {"parallel_idx_autovac", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, parallel_idx_autovac)},
{"vacuum_max_eager_freeze_failure_rate", RELOPT_TYPE_REAL,
offsetof(StdRdOptions, vacuum_max_eager_freeze_failure_rate)}
};
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..ab6706743a9 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,16 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table to be processed in
+ * parallel during autovacuum
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2241,35 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * Decide whether the indexes of this table should be processed in parallel
+ * during autovacuum.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ CanUseParallelIdxAutovacForRelation(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (max_parallel_index_autovac_workers > 0)
+ {
+ params->nworkers =
+ Min((num_indexes / AV_PARALLEL_INDEXES_PER_WORKER) + 1,
+ max_parallel_index_autovac_workers);
+ }
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..077f7a8ff6a 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,20 +1,23 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
- * multiple passes of index bulk-deletion and index cleanup.
+ * multiple passes of index bulk-deletion and index cleanup. For maintenance
+ * vacuum, we launch workers manually (using dynamic bgworkers machinery), and
+ * for autovacuum we send signals to the autovacuum launcher (all logic for
+ * communication among parallel autovacuum processes is in autovacuum.c).
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -34,9 +37,11 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
+#include "utils/memutils.h"
#include "utils/rel.h"
/*
@@ -157,11 +162,20 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
- /* NULL for worker processes */
+ /* Is this structure used for maintenance vacuum or autovacuum */
+ bool is_autovacuum;
+
+ /*
+ * NULL for worker processes.
+ *
+ * NOTE: Parallel autovacuum only needs a subset of the maintenance vacuum
+ * functionality.
+ */
ParallelContext *pcxt;
/* Parent Heap Relation */
@@ -221,6 +235,10 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static ParallelContext *CreateParallelAutoVacContext(int nworkers);
+static void InitializeParallelAutoVacDSM(ParallelContext *pcxt);
+static void DestroyParallelAutoVacContext(ParallelContext *pcxt);
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -280,15 +298,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
}
pvs = (ParallelVacuumState *) palloc0(sizeof(ParallelVacuumState));
+ pvs->is_autovacuum = AmAutoVacuumWorkerProcess();
pvs->indrels = indrels;
pvs->nindexes = nindexes;
pvs->will_parallel_vacuum = will_parallel_vacuum;
pvs->bstrategy = bstrategy;
pvs->heaprel = rel;
- EnterParallelMode();
- pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
- parallel_workers);
+ if (pvs->is_autovacuum)
+ pcxt = CreateParallelAutoVacContext(parallel_workers);
+ else
+ {
+ EnterParallelMode();
+ pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+ parallel_workers);
+ }
Assert(pcxt->nworkers > 0);
pvs->pcxt = pcxt;
@@ -327,7 +351,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
else
querylen = 0; /* keep compiler quiet */
- InitializeParallelDSM(pcxt);
+ if (pvs->is_autovacuum)
+ InitializeParallelAutoVacDSM(pvs->pcxt);
+ else
+ InitializeParallelDSM(pcxt);
/* Prepare index vacuum stats */
indstats = (PVIndStats *) shm_toc_allocate(pcxt->toc, est_indstats_len);
@@ -371,10 +398,16 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ if (pvs->is_autovacuum)
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -453,8 +486,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
- DestroyParallelContext(pvs->pcxt);
- ExitParallelMode();
+ if (pvs->is_autovacuum)
+ DestroyParallelAutoVacContext(pvs->pcxt);
+ else
+ {
+ DestroyParallelContext((ParallelContext *) pvs->pcxt);
+ ExitParallelMode();
+ }
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
@@ -532,6 +570,144 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
}
+/*
+ * Short version of CreateParallelContext (parallel.c). Here we initialize only
+ * the fields that are needed for parallel index processing during autovacuum.
+ */
+static ParallelContext *
+CreateParallelAutoVacContext(int nworkers)
+{
+ ParallelContext *pcxt;
+ MemoryContext oldcontext;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Number of workers should be non-negative. */
+ Assert(nworkers >= 0);
+
+ /* We might be running in a short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Initialize a new ParallelContext. */
+ pcxt = palloc0(sizeof(ParallelContext));
+ pcxt->nworkers = nworkers;
+ pcxt->nworkers_to_launch = nworkers;
+ shm_toc_initialize_estimator(&pcxt->estimator);
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+
+ return pcxt;
+}
+
+/*
+ * Short version of InitializeParallelDSM (parallel.c). Here we store in the
+ * DSM only the data that is needed for parallel index processing during
+ * autovacuum.
+ */
+static void
+InitializeParallelAutoVacDSM(ParallelContext *pcxt)
+{
+ MemoryContext oldcontext;
+ Size tsnaplen = 0;
+ Size asnaplen = 0;
+ Size segsize = 0;
+ char *tsnapspace;
+ char *asnapspace;
+ Snapshot transaction_snapshot = GetTransactionSnapshot();
+ Snapshot active_snapshot = GetActiveSnapshot();
+
+ Assert(pcxt->nworkers >= 1);
+
+ /* We might be running in a very short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnaplen = EstimateSnapshotSpace(transaction_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, tsnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
+ asnaplen = EstimateSnapshotSpace(active_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, asnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+
+ /* Create DSM and initialize with new table of contents. */
+ segsize = shm_toc_estimate(&pcxt->estimator);
+ pcxt->seg = dsm_create(segsize, DSM_CREATE_NULL_IF_MAXSEGMENTS);
+
+ if (pcxt->seg == NULL)
+ {
+ pcxt->nworkers = 0;
+ pcxt->private_memory = MemoryContextAlloc(TopMemoryContext, segsize);
+ }
+
+ pcxt->toc = shm_toc_create(AV_PARALLEL_MAGIC,
+ pcxt->seg == NULL ? pcxt->private_memory :
+ dsm_segment_address(pcxt->seg),
+ segsize);
+
+ /* We can skip the rest of this if we're not budgeting for any workers. */
+ if (pcxt->nworkers > 0)
+ {
+ /*
+ * Serialize the transaction snapshot if the transaction isolation
+ * level uses a transaction snapshot.
+ */
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnapspace = shm_toc_allocate(pcxt->toc, tsnaplen);
+ SerializeSnapshot(transaction_snapshot, tsnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT,
+ tsnapspace);
+ }
+
+ /* Serialize the active snapshot. */
+ asnapspace = shm_toc_allocate(pcxt->toc, asnaplen);
+ SerializeSnapshot(active_snapshot, asnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, asnapspace);
+ }
+
+ /* Update nworkers_to_launch, in case we changed nworkers above. */
+ pcxt->nworkers_to_launch = pcxt->nworkers;
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Short version of DestroyParallelContext (parallel.c). Here we clean up only
+ * the data that was used for parallel index processing during autovacuum.
+ */
+static void
+DestroyParallelAutoVacContext(ParallelContext *pcxt)
+{
+ /*
+ * If we have allocated a shared memory segment, detach it. This will
+ * implicitly detach the error queues, and any other shared memory queues,
+ * stored there.
+ */
+ if (pcxt->seg != NULL)
+ {
+ dsm_detach(pcxt->seg);
+ pcxt->seg = NULL;
+ }
+
+ /*
+ * If this parallel context is actually in backend-private memory rather
+ * than shared memory, free that memory instead.
+ */
+ if (pcxt->private_memory != NULL)
+ {
+ pfree(pcxt->private_memory);
+ pcxt->private_memory = NULL;
+ }
+
+ AutoVacuumReleaseParallelWork(false);
+ pfree(pcxt);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -558,7 +734,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_index_autovac_workers == 0 && AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +775,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, max_parallel_index_autovac_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -670,7 +850,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
- if (num_index_scans > 0)
+ if (num_index_scans > 0 && !pvs->is_autovacuum)
ReinitializeParallelDSM(pvs->pcxt);
/*
@@ -686,9 +866,22 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* The number of workers can vary between bulkdelete and cleanup
* phase.
*/
- ReinitializeParallelWorkers(pvs->pcxt, nworkers);
-
- LaunchParallelWorkers(pvs->pcxt);
+ if (pvs->is_autovacuum)
+ {
+ pvs->pcxt->nworkers_to_launch = Min(pvs->pcxt->nworkers, nworkers);
+ if (pvs->pcxt->nworkers > 0 && pvs->pcxt->nworkers_to_launch > 0)
+ {
+ pvs->pcxt->nworkers_launched =
+ LaunchParallelAutovacuumWorkers(pvs->heaprel->rd_id,
+ pvs->pcxt->nworkers_to_launch,
+ dsm_segment_handle(pvs->pcxt->seg));
+ }
+ }
+ else
+ {
+ ReinitializeParallelWorkers(pvs->pcxt, nworkers);
+ LaunchParallelWorkers(pvs->pcxt);
+ }
if (pvs->pcxt->nworkers_launched > 0)
{
@@ -733,8 +926,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
if (nworkers > 0)
{
- /* Wait for all vacuum workers to finish */
- WaitForParallelWorkersToFinish(pvs->pcxt);
+ /*
+ * Wait for all [auto]vacuum workers involved in parallel index
+ * processing (if any) to finish and advance state machine.
+ */
+ if (pvs->is_autovacuum && pvs->pcxt->nworkers_launched >= 0)
+ ParallelAutovacuumEndSyncPoint(false);
+ else if (!pvs->is_autovacuum)
+ WaitForParallelWorkersToFinish(pvs->pcxt);
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
@@ -982,8 +1181,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
@@ -997,23 +1196,22 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
BufferUsage *buffer_usage;
WalUsage *wal_usage;
int nindexes;
+ int worker_number;
char *sharedquery;
ErrorContextCallback errcallback;
- /*
- * A parallel vacuum worker must have only PROC_IN_VACUUM flag since we
- * don't support parallel vacuum for autovacuum as of now.
- */
- Assert(MyProc->statusFlags == PROC_IN_VACUUM);
-
- elog(DEBUG1, "starting parallel vacuum worker");
+ Assert(MyProc->statusFlags == PROC_IN_VACUUM || AmAutoVacuumWorkerProcess());
+ elog(DEBUG1, "starting parallel [auto]vacuum worker");
shared = (PVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
/* Set debug_query_string for individual workers */
- sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
- debug_query_string = sharedquery;
- pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+ debug_query_string = sharedquery;
+ pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ }
/* Track query ID */
pgstat_report_query_id(shared->queryid, false);
@@ -1091,8 +1289,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
- InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
- &wal_usage[ParallelWorkerNumber]);
+
+ worker_number = AmAutoVacuumWorkerProcess() ?
+ GetAutoVacuumParallelWorkerNumber() : ParallelWorkerNumber;
+
+ InstrEndParallelQuery(&buffer_usage[worker_number],
+ &wal_usage[worker_number]);
/* Report any remaining cost-based vacuum delay time */
if (track_cost_delay_timing)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..040af5ebc14 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -90,6 +90,7 @@
#include "postmaster/postmaster.h"
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/lmgr.h"
@@ -101,6 +102,7 @@
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
+#include "utils/inval.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -129,6 +131,7 @@ int autovacuum_anl_thresh;
double autovacuum_anl_scale;
int autovacuum_freeze_max_age;
int autovacuum_multixact_freeze_max_age;
+int max_parallel_index_autovac_workers;
double autovacuum_vac_cost_delay;
int autovacuum_vac_cost_limit;
@@ -164,6 +167,14 @@ static int default_freeze_table_age;
static int default_multixact_freeze_min_age;
static int default_multixact_freeze_table_age;
+/*
+ * Number of additional workers that were requested for parallel index processing
+ * during autovacuum.
+ */
+static int nworkers_for_idx_autovac = 0;
+
+static int nworkers_launched = 0;
+
/* Memory context for long-lived data */
static MemoryContext AutovacMemCxt;
@@ -210,6 +221,9 @@ typedef struct autovac_table
char *at_datname;
} autovac_table;
+/* Forward declaration */
+typedef struct ParallelAutoVacuumWorkItem ParallelAutoVacuumWorkItem;
+
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
@@ -222,6 +236,10 @@ typedef struct autovac_table
* wi_proc pointer to PGPROC of the running worker, NULL if not started
* wi_launchtime Time at which this worker was launched
* wi_dobalance Whether this worker should be included in balance calculations
+ * wi_pcleanup if > 0, this worker must participate in parallel index
+ * vacuuming as a supportive worker. Must be 0 for the leader worker.
+ * wi_target_item used only by parallel index vacuum supportive workers. Points
+ * to the workitem that must be processed by this worker.
*
* All fields are protected by AutovacuumLock, except for wi_tableoid and
* wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -237,10 +255,22 @@ typedef struct WorkerInfoData
TimestampTz wi_launchtime;
pg_atomic_flag wi_dobalance;
bool wi_sharedrel;
+ int wi_pcleanup;
+ struct ParallelAutoVacuumWorkItem *wi_target_item;
} WorkerInfoData;
typedef struct WorkerInfoData *WorkerInfo;
+#define AmParallelIdxAutoVacSupportive() \
+ (MyWorkerInfo != NULL && \
+ MyWorkItem != NULL && \
+ MyWorkerInfo->wi_pcleanup > 0)
+
+#define AmParallelIdxAutoVacLeader() \
+ (MyWorkerInfo != NULL && \
+ MyWorkItem != NULL && \
+ MyWorkerInfo->wi_pcleanup == 0)
+
/*
* Possible signals received by the launcher from remote processes. These are
* stored atomically in shared memory so that other processes can set them
@@ -250,9 +280,10 @@ typedef enum
{
AutoVacForkFailed, /* failed trying to start a worker */
AutoVacRebalance, /* rebalance the cost limits */
+ AutoVacParallelReq, /* request for parallel index vacuum */
} AutoVacuumSignal;
-#define AutoVacNumSignals (AutoVacRebalance + 1)
+#define AutoVacNumSignals (AutoVacParallelReq + 1)
/*
* Autovacuum workitem array, stored in AutoVacuumShmem->av_workItems. This
@@ -272,6 +303,55 @@ typedef struct AutoVacuumWorkItem
#define NUM_WORKITEMS 256
+typedef enum
+{
+ LAUNCHER = 0, /* autovacuum launcher must wake everyone up */
+ LEADER, /* leader must wake everyone up */
+ LAST_WORKER, /* the last inited supportive worker must wake everyone
+ up */
+} SyncType;
+
+typedef enum
+{
+ STARTUP = 0, /* initial value - no sync points were passed */
+ START_SYNC_POINT_PASSED, /* start_sync_point was passed */
+ END_SYNC_POINT_PASSED, /* end_sync_point was passed */
+ SHUTDOWN, /* leader wants to shut down parallel index
+ vacuum due to an error that occurred */
+} Status;
+
+/*
+ * Structure, stored in AutoVacuumShmem->pav_workItem. This is used for managing
+ * parallel index processing (within a single table).
+ */
+struct ParallelAutoVacuumWorkItem
+{
+ Oid avw_database;
+ Oid avw_relation;
+ int nworkers_participating;
+ int nworkers_to_launch;
+ int nworkers_sleeping; /* leader doesn't count */
+ int nfinished; /* # of workers, that already finished parallel
+ index processing (and probably already dead) */
+
+ dsm_handle handl;
+ int leader_proc_pid;
+
+ PGPROC *leader_proc;
+ ConditionVariable cv;
+
+ bool active; /* being processed */
+ bool leader_sleeping_on_ssp; /* sleeping on start sync point */
+ bool leader_sleeping_on_esp; /* sleeping on end sync point */
+ SyncType sync_type;
+ Status status;
+
+ bool needs_launcher;
+ TimestampTz birthtime;
+};
+
+static ParallelAutoVacuumWorkItem *MyWorkItem = NULL;
+
/*-------------
* The main autovacuum shmem struct. On shared memory we store this main
* struct and the array of WorkerInfo structs. This struct keeps:
@@ -283,6 +363,10 @@ typedef struct AutoVacuumWorkItem
* av_startingWorker pointer to WorkerInfo currently being started (cleared by
* the worker itself as soon as it's up and running)
* av_workItems work item array
+ * pav_workItems array of control structures needed for parallel index
+ * processing
+ * pav_workers_left how many workers we can launch for parallel index processing
+ * (must always be < autovacuum_max_workers)
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
*
@@ -298,6 +382,8 @@ typedef struct
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+ ParallelAutoVacuumWorkItem pav_workItems[NUM_WORKITEMS];
+ int pav_workers_left;
pg_atomic_uint32 av_nworkersForBalance;
} AutoVacuumShmemStruct;
@@ -322,11 +408,17 @@ pg_noreturn static void AutoVacLauncherShutdown(void);
static void launcher_determine_sleep(bool canlaunch, bool recursing,
struct timeval *nap);
static void launch_worker(TimestampTz now);
+static void launch_worker_for_pcleanup(TimestampTz now);
+static void eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item,
+ bool all_launched);
static List *get_database_list(void);
static void rebuild_database_list(Oid newdb);
static int db_comparator(const void *a, const void *b);
static void autovac_recalculate_workers_for_balance(void);
+static int parallel_autovacuum_start_sync_point(bool keep_lock);
+static void handle_parallel_idx_autovac_errors(void);
+
static void do_autovacuum(void);
static void FreeWorkerInfo(int code, Datum arg);
@@ -355,7 +447,61 @@ static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+typedef bool (*wakeup_condition) (ParallelAutoVacuumWorkItem *item);
+static bool start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static bool end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static void CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond);
+/*
+ * Returns a pointer to a free work item that can be used for parallel index
+ * vacuuming, or NULL if there are no such work items.
+ */
+static ParallelAutoVacuumWorkItem *
+get_free_workitem_for_leader(void)
+{
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ for (int i = 0; i < NUM_WORKITEMS; i++)
+ {
+ ParallelAutoVacuumWorkItem *item = &AutoVacuumShmem->pav_workItems[i];
+
+ if (item->active && item->leader_proc_pid != MyProcPid)
+ continue;
+
+ return item;
+ }
+
+ return NULL;
+}
+
+/*
+ * Returns a pointer to the work item that must be processed by the autovacuum
+ * launcher, or NULL if there are no such work items.
+ */
+static ParallelAutoVacuumWorkItem *
+get_free_workitem_for_launcher(void)
+{
+ TimestampTz latest = GetCurrentTimestamp();
+ ParallelAutoVacuumWorkItem *item = NULL;
+
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ for (int i = 0; i < NUM_WORKITEMS; i++)
+ {
+ ParallelAutoVacuumWorkItem *tmp = &AutoVacuumShmem->pav_workItems[i];
+
+ if (!tmp->needs_launcher)
+ continue;
+
+ if (latest > tmp->birthtime)
+ {
+ latest = tmp->birthtime;
+ item = tmp;
+ }
+ }
+
+ return item;
+}
/********************************************************************
* AUTOVACUUM LAUNCHER CODE
@@ -583,7 +729,15 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(av_worker_available(), false, &nap);
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /* Take the smallest possible sleep interval. */
+ nap.tv_sec = 0;
+ nap.tv_usec = MIN_AUTOVAC_SLEEPTIME * 1000;
+ }
+ else
+ launcher_determine_sleep(!dclist_is_empty(&AutoVacuumShmem->av_freeWorkers),
+ false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -598,6 +752,23 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
ProcessAutoVacLauncherInterrupts();
+ if (MyWorkItem == NULL && max_parallel_index_autovac_workers > 0)
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->pav_workers_left > 0)
+ {
+ MyWorkItem = get_free_workitem_for_launcher();
+ if (MyWorkItem != NULL)
+ {
+ Assert(MyWorkItem->active == true);
+ MyWorkItem->needs_launcher = false;
+ nworkers_for_idx_autovac = MyWorkItem->nworkers_to_launch;
+ nworkers_launched = 0;
+ }
+ }
+ LWLockRelease(AutovacuumLock);
+ }
+
/*
* a worker finished, or postmaster signaled failure to start a worker
*/
@@ -614,6 +785,22 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
LWLockRelease(AutovacuumLock);
}
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_signal[AutoVacParallelReq])
+ {
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = false;
+
+ if (MyWorkItem == NULL)
+ {
+ MyWorkItem = get_free_workitem_for_launcher();
+ Assert(MyWorkItem != NULL && MyWorkItem->active == true);
+ MyWorkItem->needs_launcher = false;
+ nworkers_for_idx_autovac = MyWorkItem->nworkers_to_launch;
+ nworkers_launched = 0;
+ }
+ }
+ LWLockRelease(AutovacuumLock);
+
if (AutoVacuumShmem->av_signal[AutoVacForkFailed])
{
/*
@@ -686,6 +873,8 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
+ worker->wi_pcleanup = -1;
+ worker->wi_target_item = NULL;
dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -698,9 +887,27 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
}
LWLockRelease(AutovacuumLock); /* either shared or exclusive */
- /* if we can't do anything, just go back to sleep */
if (!can_launch)
+ {
+ /*
+ * If the launcher cannot launch all workers requested for parallel
+ * index vacuum, it must handle all possible lock conflicts and
+ * tell everyone that there will be no new supportive workers.
+ */
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ Assert(MyWorkItem->active);
+
+ eliminate_lock_conflicts(MyWorkItem, false);
+ nworkers_launched = nworkers_for_idx_autovac = 0;
+ MyWorkItem = NULL;
+ LWLockRelease(AutovacuumLock);
+ }
+
+ /* if we can't do anything else, just go back to sleep */
continue;
+ }
/* We're OK to start a new worker */
@@ -716,6 +923,38 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
*/
launch_worker(current_time);
}
+ else if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Check whether we have reached the limit of supportive workers.
+ */
+ if (AutoVacuumShmem->pav_workers_left == 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("cannot launch more a/v workers for parallel index cleanup of rel %u in database %u",
+ MyWorkItem->avw_relation, MyWorkItem->avw_database),
+ errhint("You might need to increase \"max_parallel_index_autovac_workers\" parameter.")));
+ eliminate_lock_conflicts(MyWorkItem, false);
+ nworkers_launched = nworkers_for_idx_autovac = 0;
+ MyWorkItem = NULL;
+ }
+ else
+ {
+ /*
+ * One of the active autovacuum workers sent us a request to launch
+ * participants for parallel index vacuum. We check this case first
+ * because we need to start participants as soon as possible.
+ */
+ launch_worker_for_pcleanup(current_time);
+ AutoVacuumShmem->pav_workers_left -= 1;
+ }
+
+ LWLockRelease(AutovacuumLock);
+ }
else
{
/*
@@ -1267,6 +1506,8 @@ do_start_worker(void)
worker->wi_dboid = avdb->adw_datid;
worker->wi_proc = NULL;
worker->wi_launchtime = GetCurrentTimestamp();
+ worker->wi_pcleanup = -1;
+ worker->wi_target_item = NULL;
AutoVacuumShmem->av_startingWorker = worker;
@@ -1349,6 +1590,132 @@ launch_worker(TimestampTz now)
}
}
+/*
+ * launch_worker_for_pcleanup
+ *
+ * Wrapper for starting a worker (requested by leader of parallel index
+ * vacuuming) from the launcher.
+ */
+static void
+launch_worker_for_pcleanup(TimestampTz now)
+{
+ WorkerInfo worker;
+ dlist_node *wptr;
+
+ Assert(MyWorkItem != NULL);
+ Assert(nworkers_launched < nworkers_for_idx_autovac);
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ /*
+ * Get a worker entry from the freelist. We checked above, so there
+ * really should be a free slot.
+ */
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+
+ worker = dlist_container(WorkerInfoData, wi_links, wptr);
+ worker->wi_dboid = InvalidOid;
+ worker->wi_proc = NULL;
+ worker->wi_launchtime = GetCurrentTimestamp();
+ worker->wi_target_item = MyWorkItem;
+
+ /*
+ * Set an indicator that this worker must join parallel index vacuum.
+ * This variable also serves as a unique id among parallel index
+ * vacuum workers. The first id is '1', because '0' is reserved for the leader.
+ */
+ worker->wi_pcleanup = (nworkers_launched + 1);
+
+ AutoVacuumShmem->av_startingWorker = worker;
+
+ SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
+
+ Assert(MyWorkItem->active);
+
+ nworkers_launched += 1;
+
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ return;
+
+ Assert(MyWorkItem->sync_type == LAUNCHER &&
+ nworkers_launched == nworkers_for_idx_autovac);
+
+ /*
+ * If the launcher managed to launch all workers requested for parallel
+ * index vacuum, it must handle all possible lock conflicts.
+ */
+ eliminate_lock_conflicts(MyWorkItem, true);
+ MyWorkItem = NULL;
+}
+
+/*
+ * Must be called from autovacuum launcher when it launched all requested
+ * workers for parallel index vacuum, or when it realized, that no more
+ * processes can be launched.
+ *
+ * In this function launcher will assign roles in such a way as to avoid lock
+ * conflicts between leader and supportive workers.
+ *
+ * AutovacuumLock must be held in exclusive mode before calling this function!
+ */
+static void
+eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item, bool all_launched)
+{
+ Assert(AmAutoVacuumLauncherProcess());
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ /* So, let's start... */
+
+ if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If both leader and all launched supportive workers are sleeping, then
+ * only we can wake everyone up.
+ */
+ ConditionVariableBroadcast(&item->cv);
+
+ /* Advance status. */
+ item->status = START_SYNC_POINT_PASSED;
+ }
+ else if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping < nworkers_launched)
+ {
+ /*
+ * If the leader is already sleeping, but several supportive workers are
+ * still initializing, we shift the responsibility for waking everyone up
+ * to the worker that completes initialization last.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+ else if (!item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If only leader is not sleeping - it must wake up all workers when it
+ * finishes all preparations.
+ */
+ item->sync_type = LEADER;
+ }
+ else
+ {
+ /*
+ * If nobody is sleeping, we assume that the leader is more likely to
+ * fall asleep first, so set sync type to LAST_WORKER. If the last worker
+ * sees that the leader is still not sleeping, it will change sync type to
+ * LEADER and go to sleep.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+
+ /*
+ * If we cannot launch all requested workers, refresh
+ * nworkers_to_launch value, so that the last worker can find out
+ * that it really is the last.
+ */
+ if (!all_launched && item->sync_type == LAST_WORKER)
+ item->nworkers_to_launch = nworkers_launched;
+}
+
/*
* Called from postmaster to signal a failure to fork a process to become
* worker. The postmaster should kill(SIGUSR2) the launcher shortly
@@ -1360,6 +1727,38 @@ AutoVacWorkerFailed(void)
AutoVacuumShmem->av_signal[AutoVacForkFailed] = true;
}
+/*
+ * Called from an autovacuum worker to signal that it needs participants in
+ * parallel index vacuum. Function sends SIGUSR2 to the launcher and returns
+ * 'true' iff signal was sent successfully.
+ */
+bool
+AutoVacParallelWorkRequest(void)
+{
+ if (AutoVacuumShmem->av_launcherpid == 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("autovacuum launcher is dead")));
+
+ return false;
+ }
+
+ if (kill(AutoVacuumShmem->av_launcherpid, SIGUSR2) < 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_SYSTEM_ERROR),
+ errmsg("failed to send signal to autovac launcher (pid %d): %m",
+ AutoVacuumShmem->av_launcherpid)));
+
+ return false;
+ }
+
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = true;
+ return true;
+}
+
+
/* SIGUSR2: a worker is up and running, or just finished, or failed to fork */
static void
avl_sigusr2_handler(SIGNAL_ARGS)
@@ -1559,6 +1958,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
{
char dbname[NAMEDATALEN];
+ Assert(MyWorkerInfo->wi_pcleanup < 0);
+
/*
* Report autovac startup to the cumulative stats system. We
* deliberately do this before InitPostgres, so that the
@@ -1593,12 +1994,122 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
+ else if (MyWorkerInfo->wi_target_item != NULL)
+ {
+ dsm_handle handle;
+ PGPROC *leader_proc;
+ int leader_proc_pid;
+ dsm_segment *seg;
+ shm_toc *toc;
+ char *asnapspace;
+ char *tsnapspace;
+ char dbname[NAMEDATALEN];
+ Snapshot tsnapshot;
+ Snapshot asnapshot;
+
+ /*
+ * We will abort parallel index vacuuming within the current process if
+ * something errors out
+ */
+ PG_TRY();
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ MyWorkItem = MyWorkerInfo->wi_target_item;
+ dbid = MyWorkItem->avw_database;
+ handle = MyWorkItem->handl;
+ leader_proc = MyWorkItem->leader_proc;
+ leader_proc_pid = MyWorkItem->leader_proc_pid;
+ LWLockRelease(AutovacuumLock);
+
+ InitPostgres(NULL, dbid, NULL, InvalidOid,
+ INIT_PG_OVERRIDE_ALLOW_CONNS,
+ dbname);
+
+ set_ps_display(dbname);
+ if (PostAuthDelay)
+ pg_usleep(PostAuthDelay * 1000000L);
+
+ /* And do an appropriate amount of work */
+ recentXid = ReadNextTransactionId();
+ recentMulti = ReadNextMultiXactId();
+
+ if (parallel_autovacuum_start_sync_point(false) == -1)
+ {
+ /* We are not participating anymore */
+ MyWorkItem = NULL;
+ MyWorkerInfo->wi_pcleanup = -1;
+ goto exit;
+ }
+
+ seg = dsm_attach(handle);
+ if (seg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not map dynamic shared memory segment")));
+
+ toc = shm_toc_attach(AV_PARALLEL_MAGIC, dsm_segment_address(seg));
+ if (toc == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid magic number in dynamic shared memory segment")));
+
+ if (!BecomeLockGroupMember(leader_proc, leader_proc_pid))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not become lock group member")));
+ }
+
+ StartTransactionCommand();
+
+ asnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, false);
+ tsnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT, true);
+ asnapshot = RestoreSnapshot(asnapspace);
+ tsnapshot = tsnapspace ? RestoreSnapshot(tsnapspace) : asnapshot;
+ RestoreTransactionSnapshot(tsnapshot, leader_proc);
+ PushActiveSnapshot(asnapshot);
+
+ /*
+ * We've changed which tuples we can see, and must therefore
+ * invalidate system caches.
+ */
+ InvalidateSystemCaches();
+
+ parallel_vacuum_main(seg, toc);
+
+ /* Must pop active snapshot so snapmgr.c doesn't complain. */
+ PopActiveSnapshot();
+
+ dsm_detach(seg);
+ CommitTransactionCommand();
+ ParallelAutovacuumEndSyncPoint(false);
+ }
+ PG_CATCH();
+ {
+ EmitErrorReport();
+ if (AmParallelIdxAutoVacSupportive())
+ handle_parallel_idx_autovac_errors();
+ }
+ PG_END_TRY();
+ }
/*
* The launcher will be notified of my death in ProcKill, *if* we managed
* to get a worker slot at all
*/
+exit:
+
+ if (MyWorkerInfo->wi_target_item != NULL)
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->pav_workers_left += 1;
+ Assert(AutoVacuumShmem->pav_workers_left <= max_parallel_index_autovac_workers);
+ LWLockRelease(AutovacuumLock);
+ }
+
/* All done, go away */
proc_exit(0);
}
@@ -2461,6 +2972,10 @@ do_autovacuum(void)
tab->at_datname, tab->at_nspname, tab->at_relname);
EmitErrorReport();
+ /* if we are parallel index vacuuming leader, we must shut it down */
+ if (AmParallelIdxAutoVacLeader())
+ handle_parallel_idx_autovac_errors();
+
/* this resets ProcGlobal->statusFlags[i] too */
AbortOutOfAnyTransaction();
FlushErrorState();
@@ -3296,6 +3811,492 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Release work item, used for managing parallel index vacuum. Must be called
+ * once and only from leader worker.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ */
+void
+AutoVacuumReleaseParallelWork(bool keep_lock)
+{
+ /*
+ * We might not get the workitem from launcher (we must not be considered
+ * as leader in this case), so just leave.
+ */
+ if (!AmParallelIdxAutoVacLeader())
+ return;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ Assert(AmParallelIdxAutoVacLeader() &&
+ MyWorkItem->leader_proc_pid == MyProcPid);
+
+ MyWorkItem->leader_proc = NULL;
+ MyWorkItem->leader_proc_pid = 0;
+ MyWorkItem->active = false;
+ MyWorkItem = NULL;
+
+ /* We are not leader anymore. */
+ MyWorkerInfo->wi_pcleanup = -1;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+}
+
+static bool
+start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ /*
+ * In normal case we should exit sleep loop after last launched
+ * supportive worker passed sync point (status == START_SYNC_POINT_PASSED).
+ * But if we are in SHUTDOWN mode, all launched workers will just exit
+ * sync point without advancing the status. We can handle this case by
+ * checking that nworkers_participating == nworkers_to_launch.
+ */
+ if (item->status == SHUTDOWN)
+ need_wakeup = (item->nworkers_participating == item->nworkers_to_launch);
+ else
+ need_wakeup = item->status == START_SYNC_POINT_PASSED;
+ }
+ else
+ need_wakeup = (item->status == START_SYNC_POINT_PASSED ||
+ item->status == SHUTDOWN);
+
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+static bool
+end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ Assert(AmParallelIdxAutoVacLeader());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ need_wakeup = item->status == END_SYNC_POINT_PASSED;
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+/*
+ * Waiting on a condition variable is a frequent operation, so it has been
+ * factored out into a separate function. Caller must hold AutovacuumLock
+ * before calling it.
+ */
+static void
+CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond)
+{
+ ConditionVariablePrepareToSleep(&item->cv);
+ LWLockRelease(AutovacuumLock);
+
+ PG_TRY();
+ {
+ do
+ {
+ ConditionVariableSleep(&item->cv, PG_WAIT_IPC);
+ } while (!wakeup_cond(item));
+ }
+ PG_CATCH();
+ {
+ ConditionVariableCancelSleep();
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+
+ ConditionVariableCancelSleep();
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+}
+
+/*
+ * This function is used to synchronize the leader with supportive workers
+ * during parallel index vacuuming. Each process will exit iff:
+ * Leader worker is ready to perform parallel vacuum &&
+ * All launched supportive workers are ready to perform parallel vacuum &&
+ * (Autovacuum launcher already launched all requested workers ||
+ * Autovacuum launcher cannot launch more workers)
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ *
+ * NOTE: Some workers may call this function when leader worker decided to shut
+ * down parallel vacuuming. In this case '-1' value will be returned.
+ */
+static int
+parallel_autovacuum_start_sync_point(bool keep_lock)
+{
+ SyncType sync_type;
+ int num_participants;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ Assert(MyWorkItem->active);
+ sync_type = MyWorkItem->sync_type;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(MyWorkItem->leader_proc_pid == MyProcPid);
+
+ /* Wake up all sleeping supportive workers, if required ... */
+ if (sync_type == LEADER)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ * Don't advance if we call this function from the error handling function
+ * (status == SHUTDOWN).
+ */
+ if (MyWorkItem->status != SHUTDOWN)
+ MyWorkItem->status = START_SYNC_POINT_PASSED;
+ }
+ /* ... otherwise, wait for somebody to wake us up */
+ else
+ {
+ MyWorkItem->leader_sleeping_on_ssp = true;
+ CVSleep(MyWorkItem, start_sync_point_wakeup_cond);
+ MyWorkItem->leader_sleeping_on_ssp = false;
+
+ /*
+ * A priori, we believe that in the end everyone should be awakened
+ * by the leader.
+ */
+ MyWorkItem->sync_type = LEADER;
+ }
+ }
+ else
+ {
+ MyWorkItem->nworkers_participating += 1;
+
+ /*
+ * If we know, that launcher will no longer attempt to launch more
+ * supportive workers for this item => we are LAST_WORKER for sure.
+ *
+ * Note that the launcher sets the LAST_WORKER sync type without knowing
+ * the current status of the leader. So we also check that the leader is
+ * sleeping before waking everyone up. Otherwise, we must wait for the
+ * leader (and ask it to wake everyone up).
+ */
+ if (MyWorkItem->nworkers_participating == MyWorkItem->nworkers_to_launch &&
+ sync_type == LAST_WORKER && MyWorkItem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ /*
+ * We must not advance status if leader wants to shut down parallel
+ * execution (see checks below).
+ */
+ if (MyWorkItem->status != SHUTDOWN)
+ MyWorkItem->status = START_SYNC_POINT_PASSED;
+ }
+ else
+ {
+ if (MyWorkItem->nworkers_participating == MyWorkItem->nworkers_to_launch &&
+ sync_type == LAST_WORKER)
+ {
+ MyWorkItem->sync_type = LEADER;
+ }
+
+ MyWorkItem->nworkers_sleeping += 1;
+ CVSleep(MyWorkItem, start_sync_point_wakeup_cond);
+ MyWorkItem->nworkers_sleeping -= 1;
+ }
+ }
+
+ /* Tell caller that it must not participate in parallel index cleanup. */
+ if (MyWorkItem->status == SHUTDOWN)
+ num_participants = -1;
+ else
+ num_participants = MyWorkItem->nworkers_participating;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return num_participants;
+}
+
+/*
+ * Like function above, but must be called by leader and supportive workers
+ * when they finished parallel index vacuum.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ */
+void
+ParallelAutovacuumEndSyncPoint(bool keep_lock)
+{
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ Assert(MyWorkItem->active);
+
+ if (MyWorkItem->nworkers_participating == 0)
+ {
+ Assert(!AmParallelIdxAutoVacSupportive());
+
+ /*
+ * We have two cases when no supportive workers were launched:
+ * 1) Leader got MyWorkItem, but launcher didn't launch any
+ * workers => just advance status, because we don't need to wait
+ * for anybody.
+ * 2) Leader didn't get MyWorkItem, because it was already in use =>
+ * we must not touch it. Just leave.
+ */
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(MyWorkItem->leader_proc_pid == MyProcPid);
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+ else
+ Assert(MyWorkItem->leader_proc_pid != MyProcPid);
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+ }
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(MyWorkItem->leader_proc_pid == MyProcPid);
+ Assert(MyWorkItem->sync_type == LEADER);
+
+ /* Wait for all workers to finish (only last worker will wake us up) */
+ if (MyWorkItem->nfinished != MyWorkItem->nworkers_participating)
+ {
+ MyWorkItem->sync_type = LAST_WORKER;
+ MyWorkItem->leader_sleeping_on_esp = true;
+ CVSleep(MyWorkItem, end_sync_point_wakeup_cond);
+ MyWorkItem->leader_sleeping_on_esp = false;
+
+ Assert(MyWorkItem->nfinished == MyWorkItem->nworkers_participating);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ */
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ else
+ {
+ MyWorkItem->nfinished += 1;
+
+ /* If we are the last worker to finish - wake up the leader.
+ *
+ * If not - just leave, because supportive worker already finished all
+ * work and must die.
+ */
+ if (MyWorkItem->sync_type == LAST_WORKER &&
+ MyWorkItem->nfinished == MyWorkItem->nworkers_participating &&
+ MyWorkItem->leader_sleeping_on_esp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ /*
+ * Don't need to check SHUTDOWN status here - all supportive workers
+ * are about to finish anyway.
+ */
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+
+ /* We are not participating anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ MyWorkItem = NULL;
+ }
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+}
+
+/*
+ * Get id of parallel index vacuum worker (counting from 0).
+ */
+int
+GetAutoVacuumParallelWorkerNumber(void)
+{
+ Assert(AmAutoVacuumWorkerProcess() && MyWorkerInfo->wi_pcleanup > 0);
+ return (MyWorkerInfo->wi_pcleanup - 1);
+}
+
+/*
+ * The leader autovacuum process can decide that it needs several helper
+ * workers to process a table in parallel mode. It must set up a parallel
+ * context and call LaunchParallelAutovacuumWorkers.
+ *
+ * In this function we do the following:
+ * 1) Send signal to autovacuum launcher that creates 'supportive workers'
+ * during launcher's standard work loop.
+ * 2) Wait for supportive workers to start.
+ *
+ * Function returns the number of workers that launcher was able to launch
+ * (may be less than 'nworkers_to_launch').
+ */
+int
+LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle)
+{
+ int nworkers_launched = 0;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (MyWorkItem == NULL)
+ MyWorkItem = get_free_workitem_for_leader();
+
+ if (MyWorkItem == NULL)
+ {
+ LWLockRelease(AutovacuumLock);
+ return -1;
+ }
+
+ /* Notify autovacuum launcher that we need supportive workers */
+ if (AutoVacParallelWorkRequest())
+ {
+ /* OK, we can use this workitem entry. Init it. */
+ MyWorkItem->avw_database = MyDatabaseId;
+ MyWorkItem->avw_relation = rel_id;
+ MyWorkItem->handl = handle;
+ MyWorkItem->leader_proc = MyProc;
+ MyWorkItem->leader_proc_pid = MyProcPid;
+ MyWorkItem->nworkers_participating = 0;
+ MyWorkItem->nworkers_to_launch = nworkers_to_launch;
+ MyWorkItem->leader_sleeping_on_ssp = false;
+ MyWorkItem->leader_sleeping_on_esp = false;
+ MyWorkItem->nworkers_sleeping = 0;
+ MyWorkItem->nfinished = 0;
+ MyWorkItem->sync_type = LAUNCHER;
+ MyWorkItem->status = STARTUP;
+
+ MyWorkItem->active = true;
+ MyWorkItem->needs_launcher = true;
+ MyWorkItem->birthtime = GetCurrentTimestamp();
+ LWLockRelease(AutovacuumLock);
+
+ /* Become the leader */
+ MyWorkerInfo->wi_pcleanup = 0;
+
+ /* All created workers must get same locks as leader process */
+ BecomeLockGroupLeader();
+
+ /*
+ * Wait until all supportive workers are launched. Also retrieve the actual
+ * number of participants
+ */
+
+ nworkers_launched = parallel_autovacuum_start_sync_point(false);
+ Assert(nworkers_launched >= 0);
+ }
+ else
+ {
+ /*
+ * If we (for any reason) cannot send signal to the launcher, don't try
+ * to do index vacuuming in parallel
+ */
+ MyWorkItem = NULL;
+ LWLockRelease(AutovacuumLock);
+ return 0;
+ }
+
+ return nworkers_launched;
+}
+
+/*
+ * During parallel index vacuuming any worker (both supportives and leader) can
+ * catch an error.
+ * In order to handle it in the right way we must call this function.
+ */
+static void
+handle_parallel_idx_autovac_errors(void)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ if (MyWorkItem->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If start sync point already passed - just wait for all supportive
+ * workers to finish and exit.
+ */
+ ParallelAutovacuumEndSyncPoint(true);
+ }
+ else if (MyWorkItem->status == STARTUP)
+ {
+ /*
+ * If no sync point has been passed, we can prevent supportive workers
+ * from performing their work - set SHUTDOWN status and wait until
+ * all workers see it.
+ */
+ MyWorkItem->status = SHUTDOWN;
+ parallel_autovacuum_start_sync_point(true);
+ }
+
+ AutoVacuumReleaseParallelWork(true);
+ }
+ else
+ {
+ Assert(AmParallelIdxAutoVacSupportive());
+
+ if (MyWorkItem->status == STARTUP || MyWorkItem->status == SHUTDOWN)
+ {
+ /*
+ * If no sync point has been passed - just exclude ourselves from
+ * participants. Further parallel index vacuuming will take place
+ * as usual.
+ */
+ MyWorkItem->nworkers_to_launch -= 1;
+
+ if (MyWorkItem->nworkers_participating == MyWorkItem->nworkers_to_launch &&
+ MyWorkItem->sync_type == LAST_WORKER && MyWorkItem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ if (MyWorkItem->status != SHUTDOWN)
+ MyWorkItem->status = START_SYNC_POINT_PASSED;
+ }
+ }
+ else if (MyWorkItem->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If start sync point already passed we will simulate the usual
+ * end of work (see ParallelAutovacuumEndSyncPoint).
+ */
+ MyWorkItem->nfinished += 1;
+
+ /*
+ * We check "!MyWorkItem->leader_sleeping_on_ssp" in order to handle an
+ * almost impossible situation, when the leader didn't have time to wake
+ * up after start sync point (but last worker already advanced
+ * status to START_SYNC_POINT_PASSED). In this case we should not
+ * advance status to END_SYNC_POINT_PASSED, so leader can continue
+ * processing.
+ */
+ if (MyWorkItem->sync_type == LAST_WORKER &&
+ MyWorkItem->nfinished == MyWorkItem->nworkers_participating &&
+ !MyWorkItem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3361,6 +4362,12 @@ AutoVacuumShmemInit(void)
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
+ memset(&AutoVacuumShmem->pav_workItems, 0,
+ sizeof(ParallelAutoVacuumWorkItem) * NUM_WORKITEMS);
+ for (int j = 0; j < NUM_WORKITEMS; j++)
+ ConditionVariableInit(&AutoVacuumShmem->pav_workItems[j].cv);
+
+ AutoVacuumShmem->pav_workers_left = max_parallel_index_autovac_workers;
worker = (WorkerInfo) ((char *) AutoVacuumShmem +
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
@@ -3406,6 +4413,19 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * GUC check_hook for max_parallel_index_autovac_workers
+ */
+bool
+check_max_parallel_index_autovac_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= autovacuum_max_workers)
+ return false;
+ return true;
+}
+
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..00c746bf853 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3647,6 +3647,16 @@ struct config_int ConfigureNamesInt[] =
check_autovacuum_work_mem, NULL, NULL
},
+ {
+ {"max_parallel_index_autovac_workers", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the maximum number of autovacuum workers that can be launched for parallel index processing during autovacuum."),
+ gettext_noop("This parameter limits the total number of such processes per cluster and must be < autovacuum_max_workers"),
+ },
+ &max_parallel_index_autovac_workers,
+ 0, 0, MAX_PARALLEL_WORKER_LIMIT,
+ check_max_parallel_index_autovac_workers, NULL, NULL
+ },
+
{
{"tcp_keepalives_idle", PGC_USERSET, CONN_AUTH_TCP,
gettext_noop("Time between issuing TCP keepalives."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..25c3c4fb258 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -146,6 +146,8 @@
#hash_mem_multiplier = 2.0 # 1-1000.0 multiplier on hash table work_mem
#maintenance_work_mem = 64MB # min 64kB
#autovacuum_work_mem = -1 # min 64kB, or -1 to use maintenance_work_mem
+#max_parallel_index_autovac_workers = 0 # this feature disabled by default
+ # (change requires restart)
#logical_decoding_work_mem = 64MB # min 64kB
#max_stack_depth = 2MB # min 100kB
#shared_memory_type = mmap # the default is the first option
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..bc3e3625a61 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -15,6 +15,8 @@
#define AUTOVACUUM_H
#include "storage/block.h"
+#include "storage/dsm_impl.h"
+#include "storage/lock.h"
/*
* Other processes can request specific work from autovacuum, identified by
@@ -25,12 +27,28 @@ typedef enum
AVW_BRINSummarizeRange,
} AutoVacuumWorkItemType;
+/*
+ * Magic number for parallel context TOC. Used for parallel index processing
+ * during autovacuum.
+ */
+#define AV_PARALLEL_MAGIC 0xaaaaaaaa
+
+/* Magic numbers for per-context parallel index processing state sharing. */
+#define AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT UINT64CONST(0xFFF0000000000001)
+#define AV_PARALLEL_KEY_ACTIVE_SNAPSHOT UINT64CONST(0xFFF0000000000002)
+
+/*
+ * During parallel index processing we want to launch one a/v worker for every
+ * 30 indexes of the table.
+ */
+#define AV_PARALLEL_INDEXES_PER_WORKER 30
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
+extern PGDLLIMPORT int max_parallel_index_autovac_workers;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
extern PGDLLIMPORT int autovacuum_vac_max_thresh;
@@ -58,12 +76,21 @@ extern void autovac_init(void);
/* called from postmaster when a worker could not be forked */
extern void AutoVacWorkerFailed(void);
+/* called from autovac worker when it needs participants in parallel index cleanup */
+extern bool AutoVacParallelWorkRequest(void);
+
pg_noreturn extern void AutoVacLauncherMain(const void *startup_data, size_t startup_data_len);
pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t startup_data_len);
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+extern void AutoVacuumReleaseParallelWork(bool keep_lock);
+extern int AutoVacuumParallelWorkWaitForStart(void);
+extern void ParallelAutovacuumEndSyncPoint(bool keep_lock);
+extern int GetAutoVacuumParallelWorkerNumber(void);
+extern int LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..fb1b52a0ee4 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_max_parallel_index_autovac_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..c4d378917a2 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -348,7 +348,8 @@ typedef struct StdRdOptions
StdRdOptIndexCleanup vacuum_index_cleanup; /* controls index vacuuming */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
bool vacuum_truncate_set; /* whether vacuum_truncate is set */
-
+ bool parallel_idx_autovac; /* enables autovacuum to process indexes
+ of this table in parallel mode */
/*
* Fraction of pages in a relation that vacuum can eagerly scan and fail
* to freeze. 0 if disabled, -1 if unspecified.
@@ -400,6 +401,16 @@ typedef struct StdRdOptions
(relation)->rd_rel->relkind == RELKIND_MATVIEW) ? \
((StdRdOptions *) (relation)->rd_options)->user_catalog_table : false)
+/*
+ * CanUseParallelIdxAutovacForRelation
+ * Check whether we can process indexes of this relation in parallel mode
+ * during autovacuum.
+ */
+#define CanUseParallelIdxAutovacForRelation(relation) \
+ (AssertMacro(RelationIsValid(relation)), \
+ (relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->parallel_idx_autovac : false)
+
/*
* RelationGetParallelWorkers
* Returns the relation's parallel_workers reloption setting.
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..9b3f52c4879
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..5da4226f0d6
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,135 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ autovacuum_max_workers = 10
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 1_000_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_idx_autovac = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+my $dead_tuples_thresh = $initial_rows_num / 4;
+my $indexes_num_thresh = $indexes_num / 2;
+my $num_workers = 2;
+
+# Reduce autovacuum_work_mem, so leader process will perform parallel index
+# vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_work_mem = 2048
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ max_parallel_index_autovac_workers = $num_workers
+ autovacuum = on
+});
+
+$node->restart;
+
+# wait for autovacuum to reset datfrozenxid age to 0
+$node->poll_query_until('postgres', q{
+ SELECT count(*) = 0 FROM pg_database WHERE mxid_age(datfrozenxid) > 0
+}) or die "Timed out while waiting for autovacuum";
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-15 21:06 Matheus Alcantara <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Matheus Alcantara @ 2025-05-15 21:06 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On 09/05/25 15:33, Daniil Davydov wrote:
> Hi,
> As I promised, here is parallel index autovacuum with bgworkers
> (Parallel-index-autovacuum-with-bgworkers.patch). This is a pretty
> simple implementation:
> 1) Added new table option `parallel_idx_autovac_enabled` that must be
> set to `true` if user wants autovacuum to process table in parallel.
> 2) Added new GUC variable `autovacuum_reserved_workers_num`. This is
> the number of parallel workers from the bgworkers pool that can be
> used only by autovacuum workers. The parameter reserves that many of
> the processes counted by `max_worker_processes`.
> 3) When an autovacuum worker decides to process some table in
> parallel, it just sets `VacuumParams->nworkers` to appropriate value
> (> 0) and then the code is executed as if it were a regular VACUUM
> PARALLEL.
> 4) I kept test/modules/autovacuum as sandbox where you can play with
> parallel index autovacuum a bit.
>
> What do you think about this implementation?
>
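For readers skimming the thread, point 3 can be made concrete with a minimal sketch of the worker-count rule as it appears in the v2-0001 patch later in this thread (v1 may differ). The function name `demo_autovac_nworkers` and the -1 "leave nworkers untouched" marker are assumptions of this sketch, not part of the patch; the two constants are copied from the patch.

```c
#include <assert.h>

/* Values taken from the v2-0001 patch in this thread. */
#define AV_PARALLEL_DEADTUP_THRESHOLD   1024
#define NUM_INDEXES_PER_PARALLEL_WORKER 30

/*
 * Sketch of the worker-count rule.  Returns the number of parallel workers
 * to request, or -1 when the sketch would leave params->nworkers untouched
 * (the -1 marker is a convention of this sketch only).
 */
int
demo_autovac_nworkers(int num_indexes, long dead_tuples, int reserved_workers)
{
    int wanted;

    assert(num_indexes >= 0);

    /* Too few dead tuples: not worth going parallel. */
    if (dead_tuples < AV_PARALLEL_DEADTUP_THRESHOLD)
        return -1;

    /* No reserved workers: the patch emits a WARNING in this case. */
    if (reserved_workers <= 0)
        return -1;

    /* One worker per 30 indexes, plus one, capped by the reserved pool. */
    wanted = num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER + 1;
    return (wanted < reserved_workers) ? wanted : reserved_workers;
}
```

With the sandbox table (80 secondary indexes plus the primary key, so 81 in total) the rule asks for 81 / 30 + 1 = 3 workers, which is then capped by the reserved pool: 1 worker when `autovacuum_reserved_workers_num = 1`.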
I've reviewed the v1-0001 patch. The build on macOS using meson+ninja is
failing:
❯❯❯ ninja -C build install
ninja: Entering directory `build'
[1/126] Compiling C object
src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible
pointer to integer conversion initializing 'int' with an expression of
type 'void *' [-Wint-conversion]
3613 | NULL,
| ^~~~
It seems that the "autovacuum_reserved_workers_num" declaration on
guc_tables.c has an extra gettext_noop() call?
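A minimal sketch of how such an error can arise, assuming a cut-down, hypothetical stand-in for the generic part of a GUC entry (`DemoGucGeneric`, `demo_entry` and `demo_entry_flags` are illustration-only names, not the real struct layout). Members are initialized positionally, so one extra description string pushes the trailing NULL onto the next member, which is an int.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, cut-down stand-in for the generic part of a GUC entry.
 * The real struct has more members; only the ordering "strings first,
 * then an int" matters for this illustration.
 */
typedef struct DemoGucGeneric
{
    const char *name;
    const char *short_desc;
    const char *long_desc;
    int         flags;
} DemoGucGeneric;

/*
 * Correct entry: one short description, NULL for the (pointer-typed)
 * long description, and an integer for flags.
 */
static const DemoGucGeneric demo_entry = {
    "autovacuum_reserved_workers_num",
    "Short description.",
    NULL,
    0
};

/*
 * Broken variant, shown only as a comment because clang rejects it with
 * -Wint-conversion:
 *
 *   { "autovacuum_reserved_workers_num",
 *     "Short description.",
 *     "Extra description.",   <- lands on long_desc
 *     NULL }                  <- lands on flags, which is an int
 */

/* Returns the flags of the correct entry after checking its layout. */
int
demo_entry_flags(void)
{
    assert(demo_entry.long_desc == NULL);
    return demo_entry.flags;
}
```

Whether this is exactly what happened in v1 is a guess based on the compiler message; the sketch only shows why a stray string in a positional initializer produces this kind of diagnostic.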
One other point: since you've added TAP tests for autovacuum, I think
you also need to add a meson.build file alongside the Makefile you
already created.
You also need to update src/test/modules/meson.build and
src/test/modules/Makefile to include the new test/modules/autovacuum
path.
--
Matheus Alcantara
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-16 05:10 Daniil Davydov <[email protected]>
parent: Matheus Alcantara <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-05-16 05:10 UTC (permalink / raw)
To: Matheus Alcantara <[email protected]>; +Cc: Sami Imseih <[email protected]>; Masahiko Sawada <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Fri, May 16, 2025 at 4:06 AM Matheus Alcantara
<[email protected]> wrote:
> I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is
> failing:
> ❯❯❯ ninja -C build install
> ninja: Entering directory `build'
> [1/126] Compiling C object
> src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> ../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible
> pointer to integer conversion initializing 'int' with an expression of
> type 'void *' [-Wint-conversion]
> 3613 | NULL,
> | ^~~~
>
Thank you for reviewing this patch!
> It seems that the "autovacuum_reserved_workers_num" declaration on
> guc_tables.c has an extra gettext_noop() call?
Good catch, I fixed this warning in the v2 version.
>
> One other point is that as you've added TAP tests for the autovacuum I
> think you also need to create a meson.build file as you already create
> the Makefile.
>
> You also need to update the src/test/modules/meson.build and
> src/test/modules/Makefile to include the new test/modules/autovacuum
> path.
>
OK, I should clarify this point: modules/autovacuum is not a normal
test but a sandbox, just an example of how we can trigger parallel
index autovacuum. It may also be used for debugging purposes.
In fact, 001_autovac_parallel.pl does not verify anything.
I'll do as you asked (add all meson and Make stuff), but please don't
focus on it. The creation of the real test is still in progress. (I'll
try to complete it as soon as possible).
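As a rough illustration of why the sandbox settings should be enough to trigger autovacuum on the test table, here is a minimal sketch of the documented trigger rule (vacuum when dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`). The function names are hypothetical; the sketch ignores `autovacuum_vacuum_max_threshold` and assumes the statistics counters match the real row counts, which in practice they only approximate.

```c
#include <assert.h>

/*
 * Documented autovacuum trigger: vacuum when dead tuples exceed
 * base_threshold + scale_factor * reltuples.
 */
double
demo_vacuum_threshold(int base_threshold, double scale_factor, double reltuples)
{
    assert(scale_factor >= 0.0 && reltuples >= 0.0);
    return base_threshold + scale_factor * reltuples;
}

/*
 * Rows i in 1..nrows with i % 3 == 0, i.e. the rows hit by the sandbox's
 * "UPDATE ... WHERE (col_1 % 3) = 0" when col_1 = i.  Each updated row
 * leaves one dead tuple behind.
 */
long
demo_rows_updated(long nrows)
{
    assert(nrows >= 0);
    return nrows / 3;
}

/* Nonzero when the dead-tuple count exceeds the threshold. */
int
demo_should_vacuum(long dead_tuples, double threshold)
{
    return (double) dead_tuples > threshold;
}
```

With the v2 sandbox values (100,000 rows, threshold 1, scale factor 0.1) the UPDATE leaves about 33,333 dead tuples against a threshold of about 10,001, so the table qualifies for autovacuum, and the dead-tuple count is also well above the patch's AV_PARALLEL_DEADTUP_THRESHOLD of 1024.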
In this message I have split the patch into two parts: implementation
and sandbox. What do you think about the implementation?
--
Best regards,
Daniil Davydov
Attachments:
[application/x-patch] v2-0002-Sandbox-for-parallel-index-autovacuum.patch (8.6K, 2-v2-0002-Sandbox-for-parallel-index-autovacuum.patch)
From 5a25535f5f4212ca756b9c67bcecf3a271ceb215 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v2 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 129 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 158 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..a44cbebe0fd
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,129 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_reserved_workers_num = 1
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_idx_autovac_enabled = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so leader process will perform parallel index
+# vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
[application/x-patch] v2-0001-Parallel-index-autovacuum-with-bgworkers.patch (16.1K, 3-v2-0001-Parallel-index-autovacuum-with-bgworkers.patch)
From c518f1226f8961fdef88600a6d388674e184cff7 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v2 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 11 ++++
src/backend/commands/vacuum.c | 55 +++++++++++++++++++
src/backend/commands/vacuumparallel.c | 46 ++++++++++------
src/backend/postmaster/autovacuum.c | 9 +++
src/backend/postmaster/bgworker.c | 33 ++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 12 ++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 10 ++++
11 files changed, 162 insertions(+), 19 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..ccf59208783 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_idx_autovac_enabled",
+ "Allows autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1863,6 +1872,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_idx_autovac_enabled", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_idx_autovac_enabled)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..f7667f14147 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,21 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes each parallel worker should process during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2246,49 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (av_reserved_workers_num > 0)
+ {
+ /*
+ * We request at least one parallel worker, if user set
+ * 'parallel_idx_autovac_enabled' option. The total number of
+ * additional parallel workers depends on how many indexes the
+ * table has. For now we assume that each parallel worker should
+ * process NUM_INDEXES_PER_PARALLEL_WORKER indexes.
+ */
+ params->nworkers =
+ Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
+ av_reserved_workers_num);
+ }
+ else
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("Cannot launch any supportive workers for parallel index cleanup of rel %s",
+ RelationGetRelationName(rel)),
+ errhint("You might need to set parameter \"autovacuum_reserved_workers_num\" to a value > 0")));
+
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..e2b3e5b343c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,15 +1,15 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (av_reserved_workers_num == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, av_reserved_workers_num) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -982,8 +996,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4d4a1a3197e..e7e340c4e7c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3406,6 +3406,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval > (max_worker_processes - 8))
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..cb86db99da9 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1046,6 +1046,8 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
BackgroundWorkerHandle **handle)
{
int slotno;
+ int from;
+ int upto;
bool success = false;
bool parallel;
uint64 generation = 0;
@@ -1088,10 +1090,23 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
return false;
}
+ /*
+ * Determine range of workers in pool, that we can use (last
+ * 'av_reserved_workers_num' is reserved for autovacuum workers).
+ */
+
+ from = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots - av_reserved_workers_num :
+ 0;
+
+ upto = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots :
+ BackgroundWorkerData->total_slots - av_reserved_workers_num;
+
/*
* Look for an unused slot. If we find one, grab it.
*/
- for (slotno = 0; slotno < BackgroundWorkerData->total_slots; ++slotno)
+ for (slotno = from; slotno < upto; ++slotno)
{
BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
@@ -1159,7 +1174,13 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
BackgroundWorkerSlot *slot;
pid_t pid;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'av_reserved_workers_num' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/*
@@ -1298,7 +1319,13 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
BackgroundWorkerSlot *slot;
bool signal_postmaster = false;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'av_reserved_workers_num' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/* Set terminate flag in shared memory, unless slot has been reused. */
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 92b0446b80c..cff13ef6bd7 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -144,6 +144,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int av_reserved_workers_num = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..90b4e9570cf 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."),
+ gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these additional processes. "
+ "Also, these processes are taken into account in \"max_parallel_workers\"."),
+ },
+ &av_reserved_workers_num,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_reserved_workers_num, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..2e38bada2b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,6 +223,7 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#autovacuum_reserved_workers_num = 0 # disabled by default and limited by max_parallel_workers
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1e59a7f910f..992c6b63226 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int NBuffers;
extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
+extern PGDLLIMPORT int av_reserved_workers_num;
extern PGDLLIMPORT int max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..9913c6e4681 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+bool check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..55aa5c45be1 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,7 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ bool parallel_idx_autovac_enabled;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +410,15 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationAllowsParallelIdxAutovac
+ * Returns whether the relation's indexes can be processed in parallel
+ * during autovacuum. Note multiple eval of argument!
+ */
+#define RelationAllowsParallelIdxAutovac(relation) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_idx_autovac_enabled : false)
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-20 22:30 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-20 22:30 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, May 15, 2025 at 10:10 PM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Fri, May 16, 2025 at 4:06 AM Matheus Alcantara
> <[email protected]> wrote:
> > I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is
> > failing:
> > ❯❯❯ ninja -C build install
> > ninja: Entering directory `build'
> > [1/126] Compiling C object
> > src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> > FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> > ../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible
> > pointer to integer conversion initializing 'int' with an expression of
> > type 'void *' [-Wint-conversion]
> > 3613 | NULL,
> > | ^~~~
> >
>
> Thank you for reviewing this patch!
>
> > It seems that the "autovacuum_reserved_workers_num" declaration on
> > guc_tables.c has an extra gettext_noop() call?
>
> Good catch, I fixed this warning in the v2 version.
>
> >
> > One other point is that as you've added TAP tests for the autovacuum I
> > think you also need to create a meson.build file as you already create
> > the Makefile.
> >
> > You also need to update the src/test/modules/meson.build and
> > src/test/modules/Makefile to include the new test/modules/autovacuum
> > path.
> >
>
> OK, I should clarify this moment : modules/autovacuum is not a normal
> test but a sandbox - just an example of how we can trigger parallel
> index autovacuum. Also it may be used for debugging purposes.
> In fact, 001_autovac_parallel.pl is not verifying anything.
> I'll do as you asked (add all meson and Make stuff), but please don't
> focus on it. The creation of the real test is still in progress. (I'll
> try to complete it as soon as possible).
>
> In this letter I will divide the patch into 2 parts : implementation
> and sandbox. What do you think about implementation?
Thank you for updating the patches. I have some comments on v2-0001 patch:
+ {
+ {"autovacuum_reserved_workers_num", PGC_USERSET,
RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Number of worker processes, reserved for
participation in parallel index processing during autovacuum."),
+ gettext_noop("This parameter is depending on
\"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these
additional processes. "
+ "Also, these processes are taken into account
in \"max_parallel_workers\"."),
+ },
+ &av_reserved_workers_num,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_reserved_workers_num, NULL, NULL
+ },
I find the name "autovacuum_reserved_workers_num" too generic. It
would be better to have a name more specific to parallel vacuum, such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.
What is this parameter meant to specify: the maximum number of
parallel vacuum workers that can be used by autovacuum as a whole, or
the maximum number of parallel vacuum workers that each autovacuum
worker can use?
---
The patch includes changes to bgworker.c so that we can reserve
some slots for autovacuum. I don't think this change is strictly
necessary, because if the user sets the related GUC parameters
correctly, autovacuum workers can use parallel vacuum as expected.
Even if we do need this change, I would suggest implementing it as a
separate patch.
---
+ {
+ {
+ "parallel_idx_autovac_enabled",
+ "Allows autovacuum to process indexes of this table in
parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
The proposed reloption name doesn't align with our naming conventions.
Looking at our existing reloptions, we typically write out full words
rather than using abbreviations like 'autovac' or 'idx'.
I guess we can implement this as an integer parameter so that the
user can specify the number of parallel vacuum workers for the table.
For example, we can have a reloption autovacuum_parallel_workers.
Setting it to 0 (the default) disables parallel vacuum during
autovacuum, and setting the special value -1 lets PostgreSQL calculate
the parallel degree for the table (same as the default VACUUM command
behavior).
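The proposed semantics can be sketched as a small standalone C
function. This is only an illustration, not code from any patch: the
helper resolve_autovacuum_parallel_workers, its parameters, and the
idea of capping by a single cluster-wide limit are assumptions made
for the example. Only the meaning of 0 and -1 comes from the proposal
above. The sketch also ignores that the leader process takes part in
index processing.

```c
/*
 * Hypothetical helper (not part of PostgreSQL or of the patch).
 *
 * reloption_value: value of the proposed autovacuum_parallel_workers
 *                  reloption (0 = disabled, negative = let the system
 *                  decide, N > 0 = explicit request).
 * nindexes:        number of indexes eligible for parallel processing
 *                  (assumed to be known by the caller).
 * limit:           assumed cluster-wide cap on parallel workers.
 *
 * Returns the number of parallel workers to request.
 */
static int
resolve_autovacuum_parallel_workers(int reloption_value, int nindexes,
                                    int limit)
{
    int requested;

    /* Parallel vacuum is disabled or cannot be used at all. */
    if (reloption_value == 0 || nindexes <= 0 || limit <= 0)
        return 0;

    if (reloption_value < 0)
        requested = nindexes;   /* -1: derive the degree from the indexes */
    else
        requested = (reloption_value < nindexes) ? reloption_value : nindexes;

    /* Never request more than the assumed global limit allows. */
    return (requested < limit) ? requested : limit;
}
```

With a limit of 8, a table with 10 indexes and the reloption set to -1
would request 8 workers, while an explicit setting of 4 would request
4, and a setting of 0 would request none.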
I've also considered some alternative names. If we were to use
parallel_maintenance_workers, it sounds like it controls the parallel
degree for all operations using max_parallel_maintenance_workers,
including CREATE INDEX. Similarly, vacuum_parallel_workers could be
interpreted as affecting both autovacuum and manual VACUUM commands,
suggesting that when users run "VACUUM (PARALLEL) t", the system would
use their specified value for the parallel degree. I prefer
autovacuum_parallel_workers or vacuum_parallel_workers.
---
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
I think that this should be done in autovacuum code.
---
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes should process each parallel worker during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
Are these fixed values really useful in common cases? I think we
already have an optimization where we skip index vacuuming if the
table has few dead tuples (see BYPASS_THRESHOLD_PAGES). Given that we
rely on users' judgment about which tables need parallel vacuum during
autovacuum, I think we don't need to apply these conditions.
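For reference, the arithmetic that these two constants drive in the
patch can be reproduced in isolation. The sketch below copies the
threshold check and the Min(...) expression from the vacuum_rel hunk.
The function name av_parallel_workers_for and its plain-integer
parameters are assumptions for illustration; they stand in for the
pgstat entry, the index list, and the GUC. A return value of 0 here
means "no parallel workers requested", which simplifies the patch's
behavior of leaving params->nworkers untouched.

```c
/* Constants as defined in the patch. */
#define AV_PARALLEL_DEADTUP_THRESHOLD   1024
#define NUM_INDEXES_PER_PARALLEL_WORKER 30

/* Local stand-in for PostgreSQL's Min() macro. */
#define Min(x, y) ((x) < (y) ? (x) : (y))

/*
 * Hypothetical standalone version of the decision made in vacuum_rel().
 * Returns the number of parallel workers the patch would request.
 */
static int
av_parallel_workers_for(long long dead_tuples, int num_indexes,
                        int reserved_workers)
{
    /* Too few dead tuples: the patch does not request parallelism. */
    if (dead_tuples < AV_PARALLEL_DEADTUP_THRESHOLD)
        return 0;

    /* No reserved workers: the patch emits a WARNING instead. */
    if (reserved_workers <= 0)
        return 0;

    /* One worker per started group of 30 indexes, capped by the GUC. */
    return Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
               reserved_workers);
}
```

With 4 reserved workers and at least 1024 dead tuples, a table with 5
indexes requests 1 worker, 30 indexes request 2, 61 indexes request 3,
and 300 indexes are capped at 4. Because of the "+ 1", even a table
with a single index requests one worker once the dead-tuple threshold
is reached.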
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-22 07:43 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-05-22 07:43 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <[email protected]> wrote:
>
> I have some comments on v2-0001 patch
Thank you for reviewing this patch!
> + {
> + {"autovacuum_reserved_workers_num", PGC_USERSET,
> RESOURCES_WORKER_PROCESSES,
> + gettext_noop("Number of worker processes, reserved for
> participation in parallel index processing during autovacuum."),
> + gettext_noop("This parameter is depending on
> \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
> + "*Only* autovacuum workers can use these
> additional processes. "
> + "Also, these processes are taken into account
> in \"max_parallel_workers\"."),
> + },
> + &av_reserved_workers_num,
> + 0, 0, MAX_BACKENDS,
> + check_autovacuum_reserved_workers_num, NULL, NULL
> + },
>
> I find that the name "autovacuum_reserved_workers_num" is generic. It
> would be better to have a more specific name for parallel vacuum such
> as autovacuum_max_parallel_workers. This parameter is related to
> neither autovacuum_worker_slots nor autovacuum_max_workers, which
> seems fine to me. Also, max_parallel_maintenance_workers doesn't
> affect this parameter.
> .......
> I've also considered some alternative names. If we were to use
> parallel_maintenance_workers, it sounds like it controls the parallel
> degree for all operations using max_parallel_maintenance_workers,
> including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> interpreted as affecting both autovacuum and manual VACUUM commands,
> suggesting that when users run "VACUUM (PARALLEL) t", the system would
> use their specified value for the parallel degree. I prefer
> autovacuum_parallel_workers or vacuum_parallel_workers.
>
Naming these variables was a headache for me. Autovacuum already
implies parallelism, because several a/v workers run in parallel. So
I think that a parameter like `autovacuum_max_parallel_workers` will
confuse people.
If we want a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.
> Which number does this parameter mean to specify: the maximum number
> of parallel vacuum workers that can be used during autovacuum or the
> maximum number of parallel vacuum workers that each autovacuum can
> use?
The first variant: the maximum number of parallel vacuum workers that
can be used during autovacuum. I will make this explicit in the
variable's description.
> + {
> + {
> + "parallel_idx_autovac_enabled",
> + "Allows autovacuum to process indexes of this table in
> parallel mode",
> + RELOPT_KIND_HEAP,
> + ShareUpdateExclusiveLock
> + },
> + false
> + },
>
> The proposed reloption name doesn't align with our naming conventions.
> Looking at our existing reloptions, we typically write out full words
> rather than using abbreviations like 'autovac' or 'idx'.
>
> The new reloption name seems not to follow the conventional naming
> style for existing reloption. For instance, we don't use abbreviations
> such as 'autovac' and 'idx'.
OK, I'll fix it.
> + /*
> + * If we are running autovacuum - decide whether we need to process indexes
> + * of table with given oid in parallel.
> + */
> + if (AmAutoVacuumWorkerProcess() &&
> + params->index_cleanup != VACOPTVALUE_DISABLED &&
> + RelationAllowsParallelIdxAutovac(rel))
>
> I think that this should be done in autovacuum code.
We need the params->index_cleanup value to decide whether to use
parallel index a/v. In autovacuum.c we have this code:
***
/*
* index_cleanup and truncate are unspecified at first in autovacuum.
* They will be filled in with usable values using their reloptions
* (or reloption defaults) later.
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
***
This variable is filled in inside the `vacuum_rel` function, so I
think we should keep the above logic in vacuum.c.
> +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
>
> These fixed values really useful in common cases? I think we already
> have an optimization where we skip vacuum indexes if the table has
> fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
When we allocate dead items (and optionally initialize parallel
autovacuum), we don't yet have a sane value for
`vacrel->lpdead_item_pages` (which is what should be compared with
BYPASS_THRESHOLD_PAGES).
The only criterion we can rely on at that point is the number of dead
tuples reported in the PgStat_StatTabEntry.
----
> I guess we can implement this parameter as an integer parameter so
> that the user can specify the number of parallel vacuum workers for
> the table. For example, we can have a reloption
> autovacuum_parallel_workers. Setting 0 (by default) means to disable
> parallel vacuum during autovacuum, and setting special value -1 means
> to let PostgreSQL calculate the parallel degree for the table (same as
> the default VACUUM command behavior).
> ...........
> The patch includes the changes to bgworker.c so that we can reserve
> some slots for autovacuums. I guess that this change is not
> necessarily necessary because if the user sets the related GUC
> parameters correctly the autovacuum workers can use parallel vacuum as
> expected. Even if we need this change, I would suggest implementing
> it as a separate patch.
> ..........
> +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
> +#define NUM_INDEXES_PER_PARALLEL_WORKER 30
>
> These fixed values really useful in common cases? Given that we rely on
> users' heuristics which table needs to use parallel vacuum during
> autovacuum, I think we don't need to apply these conditions.
> ..........
I grouped these comments together because they all relate to a single
question: how much freedom should we give the user?
Your position (as far as I understand it) is that we let users
specify any number of parallel workers for a table, and that it is the
user's responsibility to configure the relevant GUC variables so that
autovacuum can always process indexes in parallel.
We also don't need to think about thresholds: even if the table has a
small number of indexes and dead rows, if the user has set the table
option, we must run a parallel index a/v with the requested number of
parallel workers.
Please correct me if I got something wrong.
I think this logic is well suited to the `VACUUM (PARALLEL)` SQL
command, which the user invokes manually.
But autovacuum, in my view, should run as stably as possible and go
unnoticed by other processes. Thus, we must:
1) Compute resources (such as the number of parallel workers for
vacuuming a single table's indexes) as efficiently as possible.
2) Provide a guarantee that as many tables as possible (among those
requested) will be processed in parallel.
(1) can be achieved by calculating the parameters on the fly.
NUM_INDEXES_PER_PARALLEL_WORKER is a rough placeholder. I can provide
a more accurate value in the near future.
(2) can be achieved by reserving workers: we know that N workers
(from the bgworkers pool) are *always* at our disposal. When we use
such workers, we do not depend on other operations in the cluster,
and we don't interfere with other operations by taking resources away
from them.
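The reservation in (2) corresponds to the from/upto computation in the
bgworker.c hunk of the patch. A standalone sketch of that partitioning
follows. The SlotRange struct and the slot_range_for helper are
hypothetical names introduced for illustration; in the patch the same
two expressions are computed inline in
RegisterDynamicBackgroundWorker.

```c
#include <stdbool.h>

/* Half-open range [from, upto) of background worker slot indexes. */
typedef struct SlotRange
{
    int from;   /* first usable slot index (inclusive) */
    int upto;   /* one past the last usable slot index */
} SlotRange;

/*
 * Hypothetical helper mirroring the patch: the last 'reserved' slots of
 * the pool are usable only by autovacuum workers, and everything before
 * them only by other processes.
 */
static SlotRange
slot_range_for(bool is_autovacuum_worker, int total_slots, int reserved)
{
    SlotRange range;

    if (is_autovacuum_worker)
    {
        range.from = total_slots - reserved;
        range.upto = total_slots;
    }
    else
    {
        range.from = 0;
        range.upto = total_slots - reserved;
    }

    return range;
}
```

With 8 slots and 2 of them reserved, ordinary processes search slots
0..5 and autovacuum workers search slots 6..7. With nothing reserved,
the autovacuum range is empty, which matches the patch returning 0
parallel workers when the GUC is 0.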
If we give the user too much freedom in tuning parallel index a/v, all
of these requirements may be violated.
This is only my opinion, and I am open to agreeing with yours. Maybe
we need a third person to weigh in?
Please see the v3 patches, which contain the changes related to the
GUC parameter and the table option (no changes to the overall logic
yet).
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v3-0001-Parallel-index-autovacuum-with-bgworkers.patch (16.7K, 2-v3-0001-Parallel-index-autovacuum-with-bgworkers.patch)
From 2223da7a9b2ef8c8d71780ad72b24eaf6d6c1141 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v3 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 11 ++++
src/backend/commands/vacuum.c | 55 +++++++++++++++++++
src/backend/commands/vacuumparallel.c | 46 ++++++++++------
src/backend/postmaster/autovacuum.c | 14 ++++-
src/backend/postmaster/bgworker.c | 33 ++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 12 ++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 10 ++++
11 files changed, 166 insertions(+), 20 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..730096002b1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_index_autovacuum_enabled",
+ "Allows autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1863,6 +1872,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_index_autovacuum_enabled", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_index_autovacuum_enabled)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..6c2f49f203f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,21 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes should process each parallel worker during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2246,49 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (pia_reserved_workers > 0)
+ {
+ /*
+ * We request at least one parallel worker, if user set
+ * 'parallel_idx_autovac_enabled' option. The total number of
+ * additional parallel workers depends on how many indexes the
+ * table has. For now we assume that each parallel worker should
+ * process NUM_INDEXES_PER_PARALLEL_WORKER indexes.
+ */
+ params->nworkers =
+ Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
+ pia_reserved_workers);
+ }
+ else
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("Cannot launch any supportive workers for parallel index cleanup of rel %s",
+ RelationGetRelationName(rel)),
+ errhint("You might need to set parameter \"pia_reserved_workers\" to a value > 0")));
+
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..5c48a1e740e 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,15 +1,15 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (pia_reserved_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, pia_reserved_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -982,8 +996,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4d4a1a3197e..59fb52aa443 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2824,7 +2824,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
+ /*
+ * Don't request parallel mode yet. nworkers may be set to a positive
+ * value later if the table turns out to be suitable for parallel
+ * index processing.
+ */
tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
@@ -3406,6 +3410,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_pia_reserved_workers(int *newval, void **extra, GucSource source)
+{
+ if (*newval > (max_worker_processes - 8))
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..e62076939ec 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1046,6 +1046,8 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
BackgroundWorkerHandle **handle)
{
int slotno;
+ int from;
+ int upto;
bool success = false;
bool parallel;
uint64 generation = 0;
@@ -1088,10 +1090,23 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
return false;
}
+ /*
+ * Determine the range of slots in the pool that we can use (the last
+ * 'pia_reserved_workers' slots are reserved for autovacuum workers).
+ */
+
+ from = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots - pia_reserved_workers :
+ 0;
+
+ upto = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots :
+ BackgroundWorkerData->total_slots - pia_reserved_workers;
+
/*
* Look for an unused slot. If we find one, grab it.
*/
- for (slotno = 0; slotno < BackgroundWorkerData->total_slots; ++slotno)
+ for (slotno = from; slotno < upto; ++slotno)
{
BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
@@ -1159,7 +1174,13 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
BackgroundWorkerSlot *slot;
pid_t pid;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'pia_reserved_workers' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - pia_reserved_workers);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - pia_reserved_workers);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/*
@@ -1298,7 +1319,13 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
BackgroundWorkerSlot *slot;
bool signal_postmaster = false;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'pia_reserved_workers' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - pia_reserved_workers);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - pia_reserved_workers);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/* Set terminate flag in shared memory, unless slot has been reused. */
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 92b0446b80c..a6fdcd2de5b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -144,6 +144,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int pia_reserved_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..dfc18095d7b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"parallel_index_autovacuum_reserved_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Maximum number of worker processes (from bgworkers pool), reserved for participation in parallel index autovacuum."),
+ gettext_noop("This parameter depends on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these supportive processes. "
+ "Also, these processes are taken into account in \"max_parallel_workers\"."),
+ },
+ &pia_reserved_workers,
+ 0, 0, MAX_BACKENDS,
+ check_pia_reserved_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..3d96af1547f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,6 +223,7 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#parallel_index_autovacuum_reserved_workers = 0 # disabled by default and limited by max_parallel_workers
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1e59a7f910f..465dfe25009 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int NBuffers;
extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
+extern PGDLLIMPORT int pia_reserved_workers;
extern PGDLLIMPORT int max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..8507f95b2ea 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_pia_reserved_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..980c3459469 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,7 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ bool parallel_index_autovacuum_enabled;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +410,15 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationAllowsParallelIdxAutovac
+ * Returns whether the relation's indexes can be processed in parallel
+ * during autovacuum. Note multiple eval of argument!
+ */
+#define RelationAllowsParallelIdxAutovac(relation) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_index_autovacuum_enabled : false)
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
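The worker-count and memory arithmetic in the vacuumparallel.c hunks above is easier to follow outside diff form. Below is a minimal standalone sketch in C, not the patch code itself: cap_parallel_workers, effective_work_mem_kb, work_mem_per_worker and min_int are hypothetical names, and the GUC values are passed in as plain parameters. It assumes nindexes_parallel has already been adjusted for leader participation. One deliberate difference: autovacuum_work_mem defaults to -1 (meaning "use maintenance_work_mem"), so the sketch resolves that fallback before dividing, whereas the hunk above divides autovacuum_work_mem directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Smaller of two ints; stands in for PostgreSQL's Min() macro. */
static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * Hypothetical model of the cap applied at the end of
 * parallel_vacuum_compute_workers() in the patch: autovacuum workers are
 * limited by the reserved-workers GUC, everything else by
 * max_parallel_maintenance_workers. A limit of 0 disables parallelism.
 */
static int
cap_parallel_workers(bool is_autovacuum, int nrequested, int nindexes_parallel,
                     int reserved_workers, int max_maintenance_workers)
{
    int limit = is_autovacuum ? reserved_workers : max_maintenance_workers;
    int workers;

    if (limit == 0 || nindexes_parallel <= 0)
        return 0;

    /* Use the requested degree if any, else one worker per index. */
    workers = (nrequested > 0) ? min_int(nrequested, nindexes_parallel)
                               : nindexes_parallel;
    return min_int(workers, limit);
}

/*
 * Assumption of this sketch: resolve the memory budget (in kB) before any
 * division, treating autovacuum_work_mem = -1 as "use maintenance_work_mem".
 */
static int
effective_work_mem_kb(bool is_autovacuum, int autovacuum_work_mem_kb,
                      int maintenance_work_mem_kb)
{
    if (is_autovacuum && autovacuum_work_mem_kb != -1)
        return autovacuum_work_mem_kb;
    return maintenance_work_mem_kb;
}

/*
 * Per-worker share of the budget: divided among the workers only when some
 * indexes actually consume maintenance memory (nindexes_mwm > 0).
 */
static int
work_mem_per_worker(int work_mem_kb, int parallel_workers, int nindexes_mwm)
{
    assert(parallel_workers > 0);

    if (nindexes_mwm <= 0)
        return work_mem_kb;
    return work_mem_kb / min_int(parallel_workers, nindexes_mwm);
}
```

Under this model, the configuration used by the 0002 test (parallel_index_autovacuum_reserved_workers = 1, 80 secondary indexes, no explicit degree) caps an autovacuum worker at a single parallel worker.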
[text/x-patch] v3-0002-Sandbox-for-parallel-index-autovacuum.patch (8.6K, 3-v3-0002-Sandbox-for-parallel-index-autovacuum.patch)
From d17a01ef2ace5fc6cfd1d22930454d90cfbe63dd Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v3 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 129 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 158 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..5aea3f10e38
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,129 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ parallel_index_autovacuum_reserved_workers = 1
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_idx_autovac_enabled = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-22 17:48 Sami Imseih <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Sami Imseih @ 2025-05-22 17:48 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
I started looking at the patch but I have some high level thoughts I would
like to share before looking further.
> > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > would be better to have a more specific name for parallel vacuum such
> > as autovacuum_max_parallel_workers. This parameter is related to
> > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > affect this parameter.
> > .......
> > I've also considered some alternative names. If we were to use
> > parallel_maintenance_workers, it sounds like it controls the parallel
> > degree for all operations using max_parallel_maintenance_workers,
> > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > interpreted as affecting both autovacuum and manual VACUUM commands,
> > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > use their specified value for the parallel degree. I prefer
> > autovacuum_parallel_workers or vacuum_parallel_workers.
> >
>
> This was my headache when I created names for variables. Autovacuum
> initially implies parallelism, because we have several parallel a/v
> workers. So I think that parameter like
> `autovacuum_max_parallel_workers` will confuse somebody.
> If we want to have a more specific name, I would prefer
> `max_parallel_index_autovacuum_workers`.
I don't think we should have a separate pool of parallel workers for those
that are used to support parallel autovacuum. At the end of the day, these
are parallel workers and they should be capped by max_parallel_workers. I
think it will be confusing if we claim these are parallel workers, but they
are coming from a different pool.
I envision we have another GUC such as "max_parallel_autovacuum_workers"
(which I think is a better name) that matches the behavior of
"max_parallel_maintenance_workers". Meaning that the autovacuum workers
still maintain their existing behavior (launching a worker per table), and
if they do need to vacuum in parallel, they can draw from a pool of parallel
workers.
With the above said, I therefore think the reloption should actually be a
number of parallel workers rather than a boolean. Let's take an example of a
user who has 3 tables that they wish (auto)vacuum to process in parallel,
and, if workers are available, they wish each of these tables to be
autovacuumed with 4 parallel workers. However, so as not to overload the
system, they cap 'max_parallel_maintenance_workers' at something like 8. If
it so happens that all 3 tables are autovacuumed at the same time, there may
not be enough parallel workers, so one table will be a loser and be vacuumed
serially. That is acceptable, and a/v logging (and perhaps other stat views)
should display this behavior: workers planned vs. workers launched.
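The 3-tables example can be made concrete with a small model. This is only a sketch under stated assumptions: WorkerPool, pool_launch and pool_release are hypothetical names, and the pool capacity stands in for whichever GUC ends up capping parallel autovacuum workers.

```c
#include <assert.h>

/*
 * Hypothetical model of a shared pool of parallel workers. 'capacity' plays
 * the role of the cap discussed above (8 in the example).
 */
typedef struct WorkerPool
{
    int capacity;   /* maximum workers that may run at once */
    int in_use;     /* workers currently handed out */
} WorkerPool;

/*
 * Try to take 'planned' workers from the pool. Returns the number actually
 * launched, which may be fewer than planned, or 0 (serial vacuum).
 */
static int
pool_launch(WorkerPool *pool, int planned)
{
    int available = pool->capacity - pool->in_use;
    int launched;

    assert(planned >= 0 && available >= 0);

    launched = (planned < available) ? planned : available;
    pool->in_use += launched;
    return launched;
}

/* Give workers back once the index vacuum or cleanup phase has finished. */
static void
pool_release(WorkerPool *pool, int nworkers)
{
    assert(nworkers >= 0 && nworkers <= pool->in_use);
    pool->in_use -= nworkers;
}
```

With a capacity of 8 and three tables each planning 4 workers at the same moment, successive pool_launch calls return 4, 4 and 0: the third table is the "loser" and is vacuumed serially, and the planned/launched pair (4 vs. 0) is what the logging would report.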
Thoughts?
--
Sami Imseih
Amazon Web Services (AWS)
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-22 23:12 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-22 23:12 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <[email protected]> wrote:
> >
> > I have some comments on v2-0001 patch
>
> Thank you for reviewing this patch!
>
> > + {
> > + {"autovacuum_reserved_workers_num", PGC_USERSET,
> > RESOURCES_WORKER_PROCESSES,
> > + gettext_noop("Number of worker processes, reserved for
> > participation in parallel index processing during autovacuum."),
> > + gettext_noop("This parameter is depending on
> > \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
> > + "*Only* autovacuum workers can use these
> > additional processes. "
> > + "Also, these processes are taken into account
> > in \"max_parallel_workers\"."),
> > + },
> > + &av_reserved_workers_num,
> > + 0, 0, MAX_BACKENDS,
> > + check_autovacuum_reserved_workers_num, NULL, NULL
> > + },
> >
> > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > would be better to have a more specific name for parallel vacuum such
> > as autovacuum_max_parallel_workers. This parameter is related to
> > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > affect this parameter.
> > .......
> > I've also considered some alternative names. If we were to use
> > parallel_maintenance_workers, it sounds like it controls the parallel
> > degree for all operations using max_parallel_maintenance_workers,
> > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > interpreted as affecting both autovacuum and manual VACUUM commands,
> > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > use their specified value for the parallel degree. I prefer
> > autovacuum_parallel_workers or vacuum_parallel_workers.
> >
>
> This was my headache when I created names for variables. Autovacuum
> initially implies parallelism, because we have several parallel a/v
> workers.
I'm not sure that is parallelism. We can have multiple autovacuum
workers simultaneously working on different tables, which doesn't seem
like parallelism to me.
> So I think that parameter like
> `autovacuum_max_parallel_workers` will confuse somebody.
> If we want to have a more specific name, I would prefer
> `max_parallel_index_autovacuum_workers`.
It's better not to use 'index' as we're trying to extend parallel
vacuum to heap scanning/vacuuming as well[1].
>
> > + /*
> > + * If we are running autovacuum - decide whether we need to process indexes
> > + * of table with given oid in parallel.
> > + */
> > + if (AmAutoVacuumWorkerProcess() &&
> > + params->index_cleanup != VACOPTVALUE_DISABLED &&
> > + RelationAllowsParallelIdxAutovac(rel))
> >
> > I think that this should be done in autovacuum code.
>
> We need params->index cleanup variable to decide whether we need to
> use parallel index a/v. In autovacuum.c we have this code :
> ***
> /*
> * index_cleanup and truncate are unspecified at first in autovacuum.
> * They will be filled in with usable values using their reloptions
> * (or reloption defaults) later.
> */
> tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
> tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
> ***
> This variable is filled in inside the `vacuum_rel` function, so I
> think we should keep the above logic in vacuum.c.
I guess that we can specify the parallel degree even if index_cleanup
is still UNSPECIFIED. vacuum_rel() would then decide whether to use
index vacuuming and vacuumlazy.c would decide whether to use parallel
vacuum based on the specified parallel degree and index_cleanup value.
>
> > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
> >
> > These fixed values really useful in common cases? I think we already
> > have an optimization where we skip vacuum indexes if the table has
> > fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
>
> When we allocate dead items (and optionally init parallel autocuum) we
> don't have sane value for `vacrel->lpdead_item_pages` (which should be
> compared with BYPASS_THRESHOLD_PAGES).
> The only criterion that we can focus on is the number of dead tuples
> indicated in the PgStat_StatTabEntry.
My point is that this criterion might not be useful. We have the
bypass optimization for index vacuuming, and having many dead tuples
doesn't necessarily mean that index vacuuming will take a long time.
For example, even if the table has only a few dead tuples, index
vacuuming could take a very long time, and parallel index vacuuming
would help, if the table is very large and has many indexes.
>
> ----
>
> > I guess we can implement this parameter as an integer parameter so
> > that the user can specify the number of parallel vacuum workers for
> > the table. For example, we can have a reloption
> > autovacuum_parallel_workers. Setting 0 (by default) means to disable
> > parallel vacuum during autovacuum, and setting special value -1 means
> > to let PostgreSQL calculate the parallel degree for the table (same as
> > the default VACUUM command behavior).
> > ...........
> > The patch includes the changes to bgworker.c so that we can reserve
> > some slots for autovacuums. I guess that this change is not
> > necessarily necessary because if the user sets the related GUC
> > parameters correctly the autovacuum workers can use parallel vacuum as
> > expected. Even if we need this change, I would suggest implementing
> > it as a separate patch.
> > ..........
> > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
> > +#define NUM_INDEXES_PER_PARALLEL_WORKER 30
> >
> > These fixed values really useful in common cases? Given that we rely on
> > users' heuristics which table needs to use parallel vacuum during
> > autovacuum, I think we don't need to apply these conditions.
> > ..........
>
> I grouped these comments together, because they all relate to a single
> question : how much freedom will we give to the user?
> Your opinion (as far as I understand) is that we allow users to
> specify any number of parallel workers for tables, and it is the
> user's responsibility to configure appropriate GUC variables, so that
> autovacuum can always process indexes in parallel.
> And we don't need to think about thresholds. Even if the table has a
> small number of indexes and dead rows - if the user specified table
> option, we must do a parallel index a/v with requested number of
> parallel workers.
> Please correct me if I messed something up.
>
> I think that this logic is well suited for the `VACUUM (PARALLEL)` sql
> command, which is manually called by the user.
The current idea that users can use parallel vacuum on particular
tables based on their heuristic makes sense to me as the first
implementation.
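To illustrate the integer reloption quoted above (0 disables, -1 lets PostgreSQL choose, N requests N workers): the proposed values do not line up with the existing VacuumParams.nworkers convention, where -1 means parallel vacuum is disabled and 0 means "choose based on the number of indexes", so some translation would be needed. A minimal sketch of that mapping follows; the function and macro names are hypothetical, and autovacuum_parallel_workers is only the name proposed in this thread.

```c
#include <assert.h>

/* Existing VacuumParams.nworkers conventions, as named constants. */
#define NWORKERS_DISABLED   (-1)    /* do not use parallel vacuum */
#define NWORKERS_AUTO       0       /* choose based on the number of indexes */

/*
 * Hypothetical translation from the proposed reloption value to the value
 * stored in VacuumParams.nworkers.
 *
 *   reloption  0 (default) -> parallel vacuum disabled
 *   reloption -1           -> let PostgreSQL compute the degree
 *   reloption  N > 0       -> request exactly N workers
 */
static int
reloption_to_nworkers(int autovacuum_parallel_workers)
{
    assert(autovacuum_parallel_workers >= -1);

    if (autovacuum_parallel_workers == 0)
        return NWORKERS_DISABLED;
    if (autovacuum_parallel_workers == -1)
        return NWORKERS_AUTO;
    return autovacuum_parallel_workers;
}
```

The requested degree would still be subject to whatever GUC cap applies, so a table asking for 4 workers may get fewer.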
> But autovacuum (as I think) should work as stable as possible and
> `unnoticed` by other processes. Thus, we must :
> 1) Compute resources (such as the number of parallel workers for a
> single table's indexes vacuuming) as efficiently as possible.
> 2) Provide a guarantee that as many tables as possible (among
> requested) will be processed in parallel.
I think these ideas could be implemented on top of the current idea.
> (1) can be achieved by calculating the parameters on the fly.
> NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more
> accurate value in the near future.
I think it requires more than the number of indexes on the table to
achieve (1). Suppose that there is a very large table that gets
updated heavily and has only a few indexes. If users want to keep the
table from bloating, it would be reasonable to use parallel vacuum
during autovacuum, and it would not be a good idea to disallow
parallel vacuum solely because the table doesn't have more than 30
indexes. On the other hand, if the table used to get many updates but
no longer does, users might want to spend resources on autovacuums of
other tables. We might need to consider per-table autovacuum
frequency, the statistics of the previous autovacuum, system load,
etc. So I think that in order to achieve (1) we might need more
statistics, and using only NUM_INDEXES_PER_PARALLEL_WORKER would not
work well.
> (2) can be achieved by workers reserving - we know that N workers
> (from bgworkers pool) are *always* at our disposal. And when we use
> such workers we are not dependent on other operations in the cluster
> and we don't interfere with other operations by taking resources away
> from them.
Reserving some bgworkers for autovacuum could make sense. But I think
it's better to implement it in a general way as it could be useful in
other use cases too. That is, it might be good to implement
infrastructure so that any PostgreSQL code (possibly including
extensions) can request a pool of bgworkers allocated for a specific
usage and use bgworkers from it.
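For reference, the reservation in the v3 bgworker.c hunk amounts to splitting the slot array into two half-open ranges. A minimal sketch under that reading (SlotRange and bgworker_slot_range are hypothetical names, not PostgreSQL APIs):

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open range [from, upto) of bgworker slot numbers. */
typedef struct SlotRange
{
    int from;
    int upto;
} SlotRange;

/*
 * Hypothetical helper mirroring the from/upto computation in the patch:
 * the last 'reserved' slots belong to autovacuum, the rest to everyone else.
 */
static SlotRange
bgworker_slot_range(bool is_autovacuum, int total_slots, int reserved)
{
    SlotRange range;

    assert(reserved >= 0 && reserved <= total_slots);

    if (is_autovacuum)
    {
        range.from = total_slots - reserved;
        range.upto = total_slots;
    }
    else
    {
        range.from = 0;
        range.upto = total_slots - reserved;
    }
    return range;
}
```

With 20 slots and 1 reserved, autovacuum searches only slot 19 and all other callers search slots 0 through 18. A general mechanism of the kind suggested above would presumably replace the boolean with some pool identifier; that part is not sketched here.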
Regards,
[1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.g...
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-22 23:20 Masahiko Sawada <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Masahiko Sawada @ 2025-05-22 23:20 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, May 22, 2025 at 10:48 AM Sami Imseih <[email protected]> wrote:
>
> I started looking at the patch but I have some high level thoughts I would
> like to share before looking further.
>
> > > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > > would be better to have a more specific name for parallel vacuum such
> > > as autovacuum_max_parallel_workers. This parameter is related to
> > > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > > affect this parameter.
> > > .......
> > > I've also considered some alternative names. If we were to use
> > > parallel_maintenance_workers, it sounds like it controls the parallel
> > > degree for all operations using max_parallel_maintenance_workers,
> > > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > > interpreted as affecting both autovacuum and manual VACUUM commands,
> > > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > > use their specified value for the parallel degree. I prefer
> > > autovacuum_parallel_workers or vacuum_parallel_workers.
> > >
> >
> > This was my headache when I created names for variables. Autovacuum
> > initially implies parallelism, because we have several parallel a/v
> > workers. So I think that parameter like
> > `autovacuum_max_parallel_workers` will confuse somebody.
> > If we want to have a more specific name, I would prefer
> > `max_parallel_index_autovacuum_workers`.
>
> I don't think we should have a separate pool of parallel workers for those
> that are used to support parallel autovacuum. At the end of the day, these
> are parallel workers and they should be capped by max_parallel_workers. I think
> it will be confusing if we claim these are parallel workers, but they
> are coming from a different pool.
I agree that parallel vacuum workers used during autovacuum should be
capped by the max_parallel_workers.
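As a rough illustration of that capping (a sketch only, not code from the
patch): the helper name plan_autovacuum_workers and its parameters are made up
for this example, and the real parallel_vacuum_compute_workers() does more,
such as filtering indexes by size and AM support. The idea is just that the
per-table request is bounded first by the number of parallelizable indexes,
then by an assumed autovacuum-specific limit, and finally by
max_parallel_workers.

```c
#include <assert.h>

/* Smaller of two ints. */
int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Hypothetical helper (not part of PostgreSQL): plan the number of parallel
 * workers for one autovacuum run.
 *
 * nrequested         per-table request; 0 or less means "derive from indexes"
 * nindexes_parallel  number of indexes that can be processed in parallel
 * autovac_cap        assumed autovacuum-specific limit (a GUC in the proposal)
 * parallel_cap       value of max_parallel_workers
 */
int
plan_autovacuum_workers(int nrequested, int nindexes_parallel,
						int autovac_cap, int parallel_cap)
{
	int			nworkers;

	assert(nindexes_parallel >= 0);

	/* Nothing to parallelize, or parallelism is disabled by one of the caps. */
	if (nindexes_parallel == 0 || autovac_cap <= 0 || parallel_cap <= 0)
		return 0;

	/* Start from the request, bounded by the number of usable indexes. */
	nworkers = (nrequested > 0) ?
		min_int(nrequested, nindexes_parallel) : nindexes_parallel;

	/* Apply the autovacuum-specific cap, then the global parallel cap. */
	nworkers = min_int(nworkers, autovac_cap);
	nworkers = min_int(nworkers, parallel_cap);

	return nworkers;
}
```

For instance, plan_autovacuum_workers(16, 20, 8, 3) gives 3: the request of 16
fits within 20 indexes, is cut to 8 by the autovacuum limit, and then to 3 by
max_parallel_workers.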
>
> I envision we have another GUC such as "max_parallel_autovacuum_workers"
> (which I think is a better name) that matches the behavior of
> "max_parallel_maintenance_worker". Meaning that the autovacuum workers
> still maintain their existing behavior (launching a worker per table),
> and if they do need to vacuum in parallel, they can draw from a pool of
> parallel workers.
>
> With the above said, I therefore think the reloption should actually be a number
> of parallel workers rather than a boolean. Let's take an example of a user that
> has 3 tables they wish (auto)vacuum to process in parallel, and if available
> they wish each of these tables could be autovacuumed with 4 parallel workers.
> However, so as not to overload the system, they cap the
> 'max_parallel_maintenance_worker' to something like 8. If it so happens that all
> 3 tables are auto-vacuumed at the same time, there may not be enough parallel
> workers, so one table will be a loser and be vacuumed in serial.
+1 for the reloption having a number of parallel workers, leaving
aside the name competition.
> That is acceptable, and a/v logging (and perhaps other stat views) should
> display this behavior: workers planned vs workers launched.
Agreed. The planned vs. launched worker counts are reported only with the
VERBOSE option, so we need to change that so that autovacuum can at least
log them.
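To make the three-table scenario above concrete, here is a small,
self-contained sketch (hypothetical names, not patch code) that hands out
workers from a shared pool in arrival order. It assumes the simplest policy,
first come first served with no fairness, which is only one of several
possible behaviors.

```c
#include <assert.h>

/*
 * Hand out workers from a shared pool in arrival order.
 *
 * pool_size   total number of parallel workers available
 * planned     planned[i] is the number of workers table i asks for
 * launched    output: launched[i] is what table i actually gets
 * ntables     number of entries in both arrays
 *
 * Returns the number of tables that got no workers at all and therefore
 * fall back to serial vacuum.
 */
int
simulate_grants(int pool_size, const int *planned, int *launched, int ntables)
{
	int			free_workers = pool_size;
	int			nserial = 0;

	for (int i = 0; i < ntables; i++)
	{
		/* Grant the request, or whatever is left if the pool is short. */
		int			granted = (planned[i] < free_workers) ?
			planned[i] : free_workers;

		launched[i] = granted;
		free_workers -= granted;

		if (granted == 0)
			nserial++;
	}

	return nserial;
}
```

With a pool of 8 and three tables planning 4 workers each, the first two
tables launch 4 workers and the third launches 0, so a log line for the third
table would show "planned: 4, launched: 0".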
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-05-25 17:22 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-05-25 17:22 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <[email protected]> wrote:
>
> On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <[email protected]> wrote:
> >
> > On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <[email protected]> wrote:
> > >
> > > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > > would be better to have a more specific name for parallel vacuum such
> > > as autovacuum_max_parallel_workers. This parameter is related to
> > > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > > affect this parameter.
> >
> > This was my headache when I created names for variables. Autovacuum
> > initially implies parallelism, because we have several parallel a/v
> > workers.
>
> I'm not sure if it's parallelism. We can have multiple autovacuum
> workers simultaneously working on different tables, which seems not
> parallelism to me.
Hm, I hadn't thought about the definition of 'parallelism' in this way.
But I see your point - the next v4 patch will contain the naming that
you suggest.
>
> > So I think that parameter like
> > `autovacuum_max_parallel_workers` will confuse somebody.
> > If we want to have a more specific name, I would prefer
> > `max_parallel_index_autovacuum_workers`.
>
> It's better not to use 'index' as we're trying to extend parallel
> vacuum to heap scanning/vacuuming as well[1].
OK, I'll fix it.
> > > + /*
> > > + * If we are running autovacuum - decide whether we need to process indexes
> > > + * of table with given oid in parallel.
> > > + */
> > > + if (AmAutoVacuumWorkerProcess() &&
> > > + params->index_cleanup != VACOPTVALUE_DISABLED &&
> > > + RelationAllowsParallelIdxAutovac(rel))
> > >
> > > I think that this should be done in autovacuum code.
> >
> > We need the params->index_cleanup variable to decide whether we need to
> > use parallel index a/v. In autovacuum.c we have this code :
> > ***
> > /*
> > * index_cleanup and truncate are unspecified at first in autovacuum.
> > * They will be filled in with usable values using their reloptions
> > * (or reloption defaults) later.
> > */
> > tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
> > tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
> > ***
> > This variable is filled in inside the `vacuum_rel` function, so I
> > think we should keep the above logic in vacuum.c.
>
> I guess that we can specify the parallel degree even if index_cleanup
> is still UNSPECIFIED. vacuum_rel() would then decide whether to use
> index vacuuming and vacuumlazy.c would decide whether to use parallel
> vacuum based on the specified parallel degree and index_cleanup value.
>
> >
> > > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
> > >
> > > Are these fixed values really useful in common cases? I think we already
> > > have an optimization where we skip vacuum indexes if the table has
> > > fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
> >
> > When we allocate dead items (and optionally init parallel autovacuum) we
> > don't have sane value for `vacrel->lpdead_item_pages` (which should be
> > compared with BYPASS_THRESHOLD_PAGES).
> > The only criterion that we can focus on is the number of dead tuples
> > indicated in the PgStat_StatTabEntry.
>
> My point is that this criterion might not be useful. We have the
> bypass optimization for index vacuuming and having many dead tuples
> doesn't necessarily mean index vacuuming taking a long time. For
> example, even if the table has a few dead tuples, index vacuuming
> could take a very long time and parallel index vacuuming would help
> the situation, if the table is very large and has many indexes.
That sounds reasonable. I'll fix it.
> > But autovacuum (as I think) should work as stable as possible and
> > `unnoticed` by other processes. Thus, we must :
> > 1) Compute resources (such as the number of parallel workers for a
> > single table's indexes vacuuming) as efficiently as possible.
> > 2) Provide a guarantee that as many tables as possible (among
> > requested) will be processed in parallel.
> >
> > (1) can be achieved by calculating the parameters on the fly.
> > NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more
> > accurate value in the near future.
>
> I think it requires more things than the number of indexes on the
> table to achieve (1). Suppose that there is a very large table that
> gets updates heavily and has a few indexes. If users want to avoid the
> table from being bloated, it would be a reasonable idea to use
> parallel vacuum during autovacuum and it would not be a good idea to
> disallow using parallel vacuum solely because it doesn't have more
> than 30 indexes. On the other hand, if the table used to get many updates
> but no longer does, users might want to use resources for autovacuums on
> other tables. We might need to consider autovacuum frequencies per
> table, the statistics of the previous autovacuum, system load, etc.
> So I think that in order to achieve (1) we might need more statistics,
> and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work well.
>
It's hard for me to imagine exactly how extended statistics will help
us track such situations.
It seems that for any of our heuristics, it will be possible to come
up with a counter example.
Maybe we can give advice (via logs) to the user? But for such an
idea, tests should be conducted so that we can understand when
resource consumption becomes ineffective.
I guess that we need to agree on an implementation before conducting such tests.
> > (2) can be achieved by workers reserving - we know that N workers
> > (from bgworkers pool) are *always* at our disposal. And when we use
> > such workers we are not dependent on other operations in the cluster
> > and we don't interfere with other operations by taking resources away
> > from them.
>
> Reserving some bgworkers for autovacuum could make sense. But I think
> it's better to implement it in a general way as it could be useful in
> other use cases too. That is, it might be good to implement
> infrastructure so that any PostgreSQL code (possibly including
> extensions) can request allocating a pool of bgworkers for specific
> usage and use bgworkers from them.
Reserving infrastructure is an ambitious idea. I am not sure that we
should implement it within this thread and feature.
Maybe we should create a separate thread for it and as a
justification, refer to parallel autovacuum?
-----
Thanks, everybody, for the feedback! I'm attaching a v4 patch to this message.
Main features :
1) 'parallel_autovacuum_workers' reloption - an integer value that sets
the maximum number of parallel a/v workers that can be taken from the
bgworkers pool in order to process this table.
2) 'max_parallel_autovacuum_workers' - a GUC variable that sets the
maximum total number of parallel a/v workers that can be taken from the
bgworkers pool.
3) Parallel autovacuum does not try to use thresholds like
NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
4) Parallel autovacuum can now report statistics like "planned vs. launched".
5) For now I got rid of the 'reserving' idea, so autovacuum leaders now
compete with everyone else for parallel workers from the bgworkers pool.
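One way to picture the accounting behind (2) and (5) is a counter of free
workers that a leader decrements before launching and increments afterwards.
The following is only a simplified, single-process sketch with made-up names
(WorkerPool, pool_reserve, pool_release, pool_start); the patch itself keeps
the counter in shared memory under AutovacuumLock. The point it tries to show
is that, because leaders compete with everyone else for bgworkers, fewer
workers may start than were reserved, and the difference (reserved minus
launched) has to be handed back.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared count of free parallel a/v workers. */
typedef struct WorkerPool
{
	int			capacity;		/* configured maximum */
	int			nfree;			/* workers not currently reserved */
} WorkerPool;

void
pool_init(WorkerPool *pool, int capacity)
{
	pool->capacity = capacity;
	pool->nfree = capacity;
}

/* Reserve up to nworkers; returns how many were actually reserved. */
int
pool_reserve(WorkerPool *pool, int nworkers)
{
	int			granted = (nworkers < pool->nfree) ? nworkers : pool->nfree;

	pool->nfree -= granted;
	return granted;
}

/* Give nworkers back to the pool; the count must never be negative. */
void
pool_release(WorkerPool *pool, int nworkers)
{
	assert(nworkers >= 0);
	pool->nfree += nworkers;
	assert(pool->nfree <= pool->capacity);
}

/*
 * Reserve nplanned workers, pretend that only nlaunchable bgworkers could be
 * started, and return the unused reservations right away.  Returns the number
 * of workers that are running and must be released later by the caller.
 */
int
pool_start(WorkerPool *pool, int nplanned, int nlaunchable)
{
	int			reserved = pool_reserve(pool, nplanned);
	int			launched = (nlaunchable < reserved) ? nlaunchable : reserved;

	pool_release(pool, reserved - launched);
	return launched;
}
```

With a pool of 8, planning 4 workers but managing to launch only 3 leaves 5
free while the vacuum runs, and 8 again once the leader releases the 3 that
did start.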
What do you think about this implementation?
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v4-0001-Parallel-index-autovacuum-with-bgworkers.patch (20.6K, 2-v4-0001-Parallel-index-autovacuum-with-bgworkers.patch)
From afa3f4c3d8993b775837cd04e5d170012b9d2691 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v4 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 +++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/access/transam/parallel.c | 11 +++
src/backend/commands/vacuumparallel.c | 76 +++++++++++++------
src/backend/postmaster/autovacuum.c | 76 ++++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 +++
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 12 +++
12 files changed, 186 insertions(+), 27 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..6ba8da62546 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will be computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1863,6 +1873,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f28326bad09..2614ceba139 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3487,6 +3487,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3513,7 +3517,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 94db1ec3012..d3313774a4b 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -34,6 +34,7 @@
#include "miscadmin.h"
#include "optimizer/optimizer.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/ipc.h"
#include "storage/predicate.h"
#include "storage/spin.h"
@@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
{
WaitForParallelWorkersToFinish(pcxt);
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
pcxt->nworkers_launched = 0;
if (pcxt->known_attached_workers)
{
@@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
RESUME_INTERRUPTS();
/* Free the worker array itself. */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..c63830fd2a5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,16 +1,16 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
- * launch parallel worker processes at the start of parallel index
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one [auto]vacuum process. ParallelVacuumState contains shared information
+ * as well as the memory space for storing dead items allocated in the DSA area.
+ * We launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -541,7 +551,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*
* nrequested is the number of parallel workers that user requested. If
* nrequested is 0, we compute the parallel degree based on nindexes, that is
- * the number of indexes that support parallel vacuum. This function also
+ * the number of indexes that support parallel [auto]vacuum. This function also
* sets will_parallel_vacuum to remember indexes that participate in parallel
* vacuum.
*/
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (max_parallel_autovacuum_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, max_parallel_autovacuum_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -666,6 +680,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Reset the parallel index processing and progress counters */
pg_atomic_write_u32(&(pvs->shared->idx), 0);
+ /* Check how many workers autovacuum can provide. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
/* Setup the shared cost-based vacuum delay and launch workers */
if (nworkers > 0)
{
@@ -690,6 +708,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ ParallelAutoVacuumReleaseWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -706,16 +734,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
else
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
- "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index cleanup (planned: %d)",
+ "launched %d parallel %svacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
}
/* Vacuum the indexes that can be processed by only leader process */
@@ -982,8 +1010,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 981be42e3af..7f34e202589 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_active_parallel_workers the number of active parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_active_parallel_workers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -2840,8 +2842,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3322,6 +3328,61 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'max_parallel_autovacuum_workers' limit, leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_active_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_active_parallel_workers;
+ AutoVacuumShmem->av_active_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_active_parallel_workers -= nworkers;
+ }
+ LWLockRelease(AutovacuumLock);
+
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum workers die, leader worker must call this function
+ * in order to refresh global autovacuum state. Thus, other leaders will be able
+ * to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+ParallelAutoVacuumReleaseWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_active_parallel_workers += nworkers;
+ Assert(AutoVacuumShmem->av_active_parallel_workers <=
+ max_parallel_autovacuum_workers);
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3382,6 +3443,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_active_parallel_workers =
+ max_parallel_autovacuum_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3432,6 +3495,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_max_parallel_autovacuum_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..40a92ceecd5 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int max_parallel_autovacuum_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..950b4300100 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_parallel_autovacuum_workers", PGC_POSTMASTER, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &max_parallel_autovacuum_workers,
+ 0, 0, MAX_BACKENDS,
+ check_max_parallel_autovacuum_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 63f991c4f93..23f5c890f78 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -221,6 +221,8 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#max_parallel_autovacuum_workers = 0 # disabled by default and limited by max_parallel_workers
+ # (change requires restart)
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..7c3575b6849 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int max_parallel_autovacuum_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..b5763e6ac36 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int ParallelAutoVacuumReserveWorkers(int nworkers);
+extern void ParallelAutoVacuumReleaseWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..d4e6170d45c 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_max_parallel_autovacuum_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..16091e6a773 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +411,16 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationGetParallelAutovacuumWorkers
+ * Returns the relation's parallel_autovacuum_workers reloption setting.
+ * Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+ (defaultpw))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
[text/x-patch] v4-0002-Sandbox-for-parallel-index-autovacuum.patch (8.6K, 3-v4-0002-Sandbox-for-parallel-index-autovacuum.patch)
From 4a027ce082b0b0964fc2f2f1e7c341adff14f43b Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v4 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..b4022f23948
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ max_parallel_autovacuum_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so leader process will perform parallel index
+# vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-06-17 22:36 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-06-17 22:36 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sun, May 25, 2025 at 10:22 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <[email protected]> wrote:
> >
> > On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <[email protected]> wrote:
> > >
> > > On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <[email protected]> wrote:
> > > >
> > > > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > > > would be better to have a more specific name for parallel vacuum such
> > > > as autovacuum_max_parallel_workers. This parameter is related to
> > > > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > > > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > > > affect this parameter.
> > >
> > > This was my headache when I created names for variables. Autovacuum
> > > initially implies parallelism, because we have several parallel a/v
> > > workers.
> >
> > I'm not sure if it's parallelism. We can have multiple autovacuum
> > workers simultaneously working on different tables, which seems not
> > parallelism to me.
>
> Hm, I didn't think about the 'parallelism' definition in this way.
> But I see your point - the next v4 patch will contain the naming that
> you suggest.
>
> >
> > > So I think that parameter like
> > > `autovacuum_max_parallel_workers` will confuse somebody.
> > > If we want to have a more specific name, I would prefer
> > > `max_parallel_index_autovacuum_workers`.
> >
> > It's better not to use 'index' as we're trying to extend parallel
> > vacuum to heap scanning/vacuuming as well[1].
>
> OK, I'll fix it.
>
> > > > + /*
> > > > + * If we are running autovacuum - decide whether we need to process indexes
> > > > + * of table with given oid in parallel.
> > > > + */
> > > > + if (AmAutoVacuumWorkerProcess() &&
> > > > + params->index_cleanup != VACOPTVALUE_DISABLED &&
> > > > + RelationAllowsParallelIdxAutovac(rel))
> > > >
> > > > I think that this should be done in autovacuum code.
> > >
> > > We need the params->index_cleanup variable to decide whether we need to
> > > use parallel index a/v. In autovacuum.c we have this code :
> > > ***
> > > /*
> > > * index_cleanup and truncate are unspecified at first in autovacuum.
> > > * They will be filled in with usable values using their reloptions
> > > * (or reloption defaults) later.
> > > */
> > > tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
> > > tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
> > > ***
> > > This variable is filled in inside the `vacuum_rel` function, so I
> > > think we should keep the above logic in vacuum.c.
> >
> > I guess that we can specify the parallel degree even if index_cleanup
> > is still UNSPECIFIED. vacuum_rel() would then decide whether to use
> > index vacuuming and vacuumlazy.c would decide whether to use parallel
> > vacuum based on the specified parallel degree and index_cleanup value.
> >
> > >
> > > > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
> > > >
> > > > Are these fixed values really useful in common cases? I think we already
> > > > have an optimization where we skip vacuum indexes if the table has
> > > > fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
> > >
> > > When we allocate dead items (and optionally init parallel autovacuum) we
> > > don't have sane value for `vacrel->lpdead_item_pages` (which should be
> > > compared with BYPASS_THRESHOLD_PAGES).
> > > The only criterion that we can focus on is the number of dead tuples
> > > indicated in the PgStat_StatTabEntry.
> >
> > My point is that this criterion might not be useful. We have the
> > bypass optimization for index vacuuming and having many dead tuples
> > doesn't necessarily mean index vacuuming taking a long time. For
> > example, even if the table has a few dead tuples, index vacuuming
> > could take a very long time and parallel index vacuuming would help
> > the situation, if the table is very large and has many indexes.
>
> That sounds reasonable. I'll fix it.
>
> > > But autovacuum (as I think) should work as stable as possible and
> > > `unnoticed` by other processes. Thus, we must :
> > > 1) Compute resources (such as the number of parallel workers for a
> > > single table's indexes vacuuming) as efficiently as possible.
> > > 2) Provide a guarantee that as many tables as possible (among
> > > requested) will be processed in parallel.
> > >
> > > (1) can be achieved by calculating the parameters on the fly.
> > > NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more
> > > accurate value in the near future.
> >
> > I think it requires more things than the number of indexes on the
> > table to achieve (1). Suppose that there is a very large table that
> > gets updates heavily and has a few indexes. If users want to avoid the
> > table from being bloated, it would be a reasonable idea to use
> > parallel vacuum during autovacuum and it would not be a good idea to
> > disallow using parallel vacuum solely because it doesn't have more
> > than 30 indexes. On the other hand, if the table had got many updates
> > but not so now, users might want to use resources for autovacuums on
> > other tables. We might need to consider autovacuum frequencies per
> > table, the statistics of the previous autovacuum, or system loads etc.
> > So I think that in order to achieve (1) we might need more statistics
> > and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine.
> >
>
> It's hard for me to imagine exactly how extended statistics will help
> us track such situations.
> It seems that for any of our heuristics, it will be possible to come
> up with a counter example.
> Maybe we can give advice (via logs) to the user? But for such an
> idea, tests should be conducted so that we can understand when
> resource consumption becomes ineffective.
> I guess that we need to agree on an implementation before conducting such tests.
>
> > > (2) can be achieved by workers reserving - we know that N workers
> > > (from bgworkers pool) are *always* at our disposal. And when we use
> > > such workers we are not dependent on other operations in the cluster
> > > and we don't interfere with other operations by taking resources away
> > > from them.
> >
> > Reserving some bgworkers for autovacuum could make sense. But I think
> > it's better to implement it in a general way as it could be useful in
> > other use cases too. That is, it might be good to implement
> > infrastructure so that any PostgreSQL code (possibly including
> > extensions) can request allocating a pool of bgworkers for specific
> > usage and use bgworkers from them.
>
> Reserving infrastructure is an ambitious idea. I am not sure that we
> should implement it within this thread and feature.
> Maybe we should create a separate thread for it and as a
> justification, refer to parallel autovacuum?
>
> -----
> Thanks everybody for feedback! I attach a v4 patch to this letter.
> Main features :
> 1) 'parallel_autovacuum_workers' reloption - integer value, that sets
> the maximum number of parallel a/v workers that can be taken from
> bgworkers pool in order to process this table.
> 2) 'max_parallel_autovacuum_workers' - GUC variable, that sets the
> maximum total number of parallel a/v workers, that can be taken from
> bgworkers pool.
> 3) Parallel autovacuum does not try to use thresholds like
> NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
> 4) Parallel autovacuum now can report statistics like "planned vs. launched".
> 5) For now I got rid of the 'reserving' idea, so now autovacuum
> leaders are competing with everyone for parallel workers from the
> bgworkers pool.
>
> What do you think about this implementation?
>
I think it basically makes sense to me. A few comments:
---
The patch implements max_parallel_autovacuum_workers as a
PGC_POSTMASTER parameter but can we make it PGC_SIGHUP? I think we
don't necessarily need to make it a PGC_POSTMASTER since it actually
doesn't affect how much shared memory we need to allocate.
---
I think it's better to have the prefix "autovacuum" for the new GUC
parameter for better consistency with other autovacuum-related GUC
parameters.
---
#include "storage/spin.h"
@@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
{
WaitForParallelWorkersToFinish(pcxt);
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
pcxt->nworkers_launched = 0;
if (pcxt->known_attached_workers)
{
@@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
RESUME_INTERRUPTS();
I think that it's better to release workers in vacuumparallel.c rather
than parallel.c.
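A minimal sketch of the accounting being discussed, assuming a single shared counter capped by the new GUC. The names WorkerPool, pool_reserve and pool_release are hypothetical and are not the patch's ParallelAutoVacuumReleaseWorkers; the locking that a real shared-memory counter would need is left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the autovacuum shared-memory counters. */
typedef struct WorkerPool
{
	int		max_workers;	/* cluster-wide cap, analogous to the new GUC */
	int		reserved;		/* workers currently held by autovacuum leaders */
} WorkerPool;

/* Reserve up to nrequested workers; returns how many were granted. */
static int
pool_reserve(WorkerPool *pool, int nrequested)
{
	int		avail = pool->max_workers - pool->reserved;
	int		granted;

	assert(nrequested >= 0);
	if (avail < 0)
		avail = 0;			/* the cap may have been lowered meanwhile */
	granted = (nrequested < avail) ? nrequested : avail;
	pool->reserved += granted;
	return granted;
}

/* Give back workers once the parallel pass has finished. */
static void
pool_release(WorkerPool *pool, int nworkers)
{
	assert(nworkers >= 0 && nworkers <= pool->reserved);
	pool->reserved -= nworkers;
}
```

With a cap of 10, a leader asking for 5 workers gets 5, a second leader asking for 8 gets the remaining 5, and the count drops again once each leader releases what it launched. Doing the release next to the vacuum-specific launch code would presumably keep autovacuum-specific branches out of the generic parallel infrastructure.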
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-06-18 08:03 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-06-18 08:03 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Wed, Jun 18, 2025 at 5:37 AM Masahiko Sawada <[email protected]> wrote:
>
> On Sun, May 25, 2025 at 10:22 AM Daniil Davydov <[email protected]> wrote:
> >
> > Thanks everybody for feedback! I attach a v4 patch to this letter.
> > Main features :
> > 1) 'parallel_autovacuum_workers' reloption - integer value, that sets
> > the maximum number of parallel a/v workers that can be taken from
> > bgworkers pool in order to process this table.
> > 2) 'max_parallel_autovacuum_workers' - GUC variable, that sets the
> > maximum total number of parallel a/v workers, that can be taken from
> > bgworkers pool.
> > 3) Parallel autovacuum does not try to use thresholds like
> > NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
> > 4) Parallel autovacuum now can report statistics like "planned vs. launched".
> > 5) For now I got rid of the 'reserving' idea, so now autovacuum
> > leaders are competing with everyone for parallel workers from the
> > bgworkers pool.
> >
> > What do you think about this implementation?
> >
>
> I think it basically makes sense to me. A few comments:
>
> ---
> The patch implements max_parallel_autovacuum_workers as a
> PGC_POSTMASTER parameter but can we make it PGC_SIGHUP? I think we
> don't necessarily need to make it a PGC_POSTMASTER since it actually
> doesn't affect how much shared memory we need to allocate.
>
Yep, there's nothing stopping us from doing that. This is a useful
feature; I'll implement it in the v5 patch.
> ---
> I think it's better to have the prefix "autovacuum" for the new GUC
> parameter for better consistency with other autovacuum-related GUC
> parameters.
>
> ---
> #include "storage/spin.h"
> @@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
> {
> WaitForParallelWorkersToFinish(pcxt);
> WaitForParallelWorkersToExit(pcxt);
> +
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
> +
> pcxt->nworkers_launched = 0;
> if (pcxt->known_attached_workers)
> {
> @@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
> */
> HOLD_INTERRUPTS();
> WaitForParallelWorkersToExit(pcxt);
> +
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
> +
> RESUME_INTERRUPTS();
>
> I think that it's better to release workers in vacuumparallel.c rather
> than parallel.c.
>
Agree with both comments.
Thanks for the review! Please see the v5 patch:
1) GUC variable and field in autovacuum shmem are renamed
2) ParallelAutoVacuumReleaseWorkers call moved from parallel.c to
vacuumparallel.c
3) max_parallel_autovacuum_workers is now PGC_SIGHUP parameter
4) Fixed a small bug (ParallelAutoVacuumReleaseWorkers in autovacuum.c:735)
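As a rough illustration of how the per-table reloption and the cluster-wide GUC might combine, here is a sketch under stated assumptions. The helper av_parallel_degree is hypothetical and not taken from the patch. It assumes that a reloption of -1 means "parallel autovacuum off", that 0 means "derive the degree from the number of indexes" (as the reloption description says), and that the leader processes one index itself.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many parallel workers an autovacuum leader
 * would request for one table.
 *
 * reloption - per-table parallel_autovacuum_workers
 *             (-1 assumed "off", 0 assumed "derive from index count")
 * nindexes  - number of indexes on the table
 * guc_cap   - cluster-wide cap on parallel autovacuum workers
 */
static int
av_parallel_degree(int reloption, int nindexes, int guc_cap)
{
	int		degree;

	assert(nindexes >= 0 && guc_cap >= 0);

	/* An index is handled by one process, so two indexes are the minimum. */
	if (reloption < 0 || nindexes < 2)
		return 0;

	/* Assumption: the leader takes one index, workers take the rest. */
	degree = nindexes - 1;

	/* A positive reloption limits the per-table degree. */
	if (reloption > 0 && reloption < degree)
		degree = reloption;

	/* The cluster-wide cap always applies. */
	if (guc_cap < degree)
		degree = guc_cap;

	return degree;
}
```

With the sandbox test's settings (81 indexes including the primary key, a reloption of 5, and a cap of 10) this sketch asks for 5 workers. How many are actually launched still depends on what the bgworkers pool can provide at that moment.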
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v5-0002-Sandbox-for-parallel-index-autovacuum.patch (8.6K, 2-v5-0002-Sandbox-for-parallel-index-autovacuum.patch)
From 144c2dfda58103638435bccc55e8fe8d27dd1fad Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v5 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ae892e5b4de
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so leader process will perform parallel index
+# vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
[text/x-patch] v5-0001-Parallel-index-autovacuum-with-bgworkers.patch (23.8K, 3-v5-0001-Parallel-index-autovacuum-with-bgworkers.patch)
From 88e55d49895ebc287213a415c242b4733cdecba8 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v5 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/commands/vacuumparallel.c | 93 ++++++++---
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 12 ++
11 files changed, 259 insertions(+), 27 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..e36d59f632b 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will be computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 09416450af9..b89b1563444 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3493,6 +3493,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3519,7 +3523,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..bd314d23298 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,16 +1,16 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
- * launch parallel worker processes at the start of parallel index
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one [auto]vacuum process. ParallelVacuumState contains shared information
+ * as well as the memory space for storing dead items allocated in the DSA area.
+ * We launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +445,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +465,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -541,7 +559,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*
* nrequested is the number of parallel workers that user requested. If
* nrequested is 0, we compute the parallel degree based on nindexes, that is
- * the number of indexes that support parallel vacuum. This function also
+ * the number of indexes that support parallel [auto]vacuum. This function also
* sets will_parallel_vacuum to remember indexes that participate in parallel
* vacuum.
*/
@@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, autovacuum_max_parallel_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -666,13 +688,26 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Reset the parallel index processing and progress counters */
pg_atomic_write_u32(&(pvs->shared->idx), 0);
+ /* Check how many workers can provide autovacuum. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
/* Setup the shared cost-based vacuum delay and launch workers */
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
if (num_index_scans > 0)
+ {
ReinitializeParallelDSM(pvs->pcxt);
+ /*
+ * Release all launched (i.e. reserved) parallel autovacuum
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
+ }
+
/*
* Set up shared cost balance and the number of active workers for
* vacuum delay. We need to do this before launching workers as
@@ -690,6 +725,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ ParallelAutoVacuumReleaseWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
else
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
- "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index cleanup (planned: %d)",
+ "launched %d parallel %svacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
}
/* Vacuum the indexes that can be processed by only leader process */
@@ -982,8 +1027,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 451fb90a610..60600b9ff52 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_available_parallel_workers the number of available parallel autovacuum
+ * workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +301,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_available_parallel_workers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +357,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void check_parallel_av_gucs(int prev_max_parallel_workers);
@@ -753,7 +757,9 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev;
+ autovacuum_max_parallel_workers_prev = autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +775,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ check_parallel_av_gucs(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2861,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3347,72 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'max_parallel_autovacuum_workers' limit, leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_available_parallel_workers;
+ AutoVacuumShmem->av_available_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_available_parallel_workers -= nworkers;
+ }
+ LWLockRelease(AutovacuumLock);
+
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum worker die, leader worker must call this function
+ * in order to refresh global autovacuum state. Thus, other leaders will be able
+ * to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+ParallelAutoVacuumReleaseWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_available_parallel_workers += nworkers;
+
+ /*
+ * If autovacuum_max_parallel_workers variable was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3473,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3525,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3565,48 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
+ /*
+ * Number of available workers must not exeed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exeed limit after releasing
+ * them (see ParallelAutoVacuumReleaseWorkers).
+ */
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+ else if ((AutoVacuumShmem->av_available_parallel_workers <
+ autovacuum_max_parallel_workers) &&
+ (autovacuum_max_parallel_workers > prev_max_parallel_workers))
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of available workers in shmem.
+ */
+ AutoVacuumShmem->av_available_parallel_workers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+
+ /*
+ * Nothing to do when autovacuum_max_parallel_workers <
+ * prev_max_parallel_workers. Available workers number will be capped
+ * inside ParallelAutoVacuumReleaseWorkers.
+ */
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f04bfedb2fd..be76263c431 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 341f88adc87..f2b6ba7755e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..b5763e6ac36 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int ParallelAutoVacuumReserveWorkers(int nworkers);
+extern void ParallelAutoVacuumReleaseWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..5c66f37cd53 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..16091e6a773 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +411,16 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationGetParallelAutovacuumWorkers
+ * Returns the relation's parallel_autovacuum_workers reloption setting.
+ * Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+ (defaultpw))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-04 14:21 Matheus Alcantara <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Matheus Alcantara @ 2025-07-04 14:21 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed Jun 18, 2025 at 5:03 AM -03, Daniil Davydov wrote:
>
> Thanks for the review! Please, see v5 patch :
> 1) GUC variable and field in autovacuum shmem are renamed
> 2) ParallelAutoVacuumReleaseWorkers call moved from parallel.c to
> vacuumparallel.c
> 3) max_parallel_autovacuum_workers is now PGC_SIGHUP parameter
> 4) Fix little bug (ParallelAutoVacuumReleaseWorkers in autovacuum.c:735)
>
Thanks for the new version!
The "autovacuum_max_parallel_workers" GUC declared in guc_tables.c
mentions that it is capped by "max_worker_processes":
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
But postgresql.conf.sample says that it is limited by
max_parallel_workers:
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers
IIUC, the code caps it by "max_worker_processes", but Masahiko mentioned
in [1] that it should be capped by max_parallel_workers.
---
We are actually capping autovacuum_max_parallel_workers at
max_worker_processes - 1, so we can't have max_worker_processes = 10
together with autovacuum_max_parallel_workers = 10. Is that correct?
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
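As a minimal sketch (not part of the patch), the effect of the ">="
comparison can be modeled in isolation. The names prefixed with "model_"
and the value 10 are assumptions made for illustration; a real GUC check
hook also receives "extra" and "source" arguments, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the max_worker_processes GUC (illustrative value). */
static int	model_max_worker_processes = 10;

/*
 * Simplified model of the quoted check hook: only the comparison is kept.
 */
static bool
model_check_autovacuum_max_parallel_workers(int newval)
{
	assert(newval >= 0);		/* the GUC's declared minimum is 0 */

	/* ">=" rejects a value equal to the limit, so the maximum is limit - 1 */
	if (newval >= model_max_worker_processes)
		return false;
	return true;
}
```

With the limit at 10, the model accepts 9 and rejects 10, which is the
"max_worker_processes - 1" behavior asked about above.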
---
Could acquiring AutovacuumLock unnecessarily, when neither of the if
conditions is true, cause a performance issue here? I don't think this
would be a serious problem, because this code is only called if the
configuration file is changed during autovacuum execution, right? But
I could be wrong, so I'm just sharing my thoughts on this (I'm still
learning the [auto]vacuum code).
+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
Typo: "exeed" should be "exceed":
+ /*
+ * Number of available workers must not exeed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exeed limit after releasing
+ * them (see ParallelAutoVacuumReleaseWorkers).
+ */
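For readers following the reload logic, here is a rough model of the
adjustment that check_parallel_av_gucs() in the v5 patch applies to the
shared counter. It is a sketch only: the function name and the pure
"value in, value out" shape are assumptions, and the locking and the
Assert from the patch are left out.

```c
#include <assert.h>

/*
 * Hypothetical model: returns the new number of available parallel
 * autovacuum workers after the limit changes from prev_max to new_max.
 */
static int
model_adjust_available(int available, int prev_max, int new_max)
{
	assert(available >= 0 && prev_max >= 0 && new_max >= 0);

	/* Limit was lowered below the free count: clamp to the new limit. */
	if (available > new_max)
		return new_max;

	/* Limit was raised: make the additional workers available. */
	if (available < new_max && new_max > prev_max)
		return available + (new_max - prev_max);

	/* Otherwise leave the counter alone; releases cap it later. */
	return available;
}
```

In this model, lowering the limit while some workers are still reserved
changes nothing immediately; the cap is applied when those workers are
released, as the quoted comment describes.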
---
I don't see this macro being used anywhere:
+/*
+ * RelationGetParallelAutovacuumWorkers
+ * Returns the relation's parallel_autovacuum_workers reloption setting.
+ * Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+ (defaultpw))
+
---
Also, some files need to be run through pgindent.
---
I've run some tests and can confirm that it works correctly as far as I
can see. I think it would be good to start including the documentation
changes. What do you think?
[1] https://www.postgresql.org/message-id/CAD21AoAxTkpkLtJDgrH9dXg_h%2ByzOZpOZj3B-4FjW1Mr4qEdbQ%40mail.g...
--
Matheus Alcantara
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-06 08:00 Daniil Davydov <[email protected]>
parent: Matheus Alcantara <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-07-06 08:00 UTC (permalink / raw)
To: Matheus Alcantara <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Fri, Jul 4, 2025 at 9:21 PM Matheus Alcantara
<[email protected]> wrote:
>
> The "autovacuum_max_parallel_workers" declared on guc_tables.c mention
> that is capped by "max_worker_process":
> + {
> + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
> + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
> + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
> + },
> + &autovacuum_max_parallel_workers,
> + 0, 0, MAX_BACKENDS,
> + check_autovacuum_max_parallel_workers, NULL, NULL
> + },
>
> IIUC the code, it cap by "max_worker_process", but Masahiko has mention
> on [1] that it should be capped by max_parallel_workers.
>
Thanks for looking into it!
To be honest, I don't think that this parameter should be explicitly
capped at all.
Other parallel operations (for example parallel index build or VACUUM
PARALLEL) just request as many workers as they want without looking at
'max_parallel_workers'.
And they will not complain if not all requested workers were launched.
Thus, even if 'autovacuum_max_parallel_workers' is higher than
'max_parallel_workers' the worst that can happen is that not all
requested workers will be running (which is a common situation).
Users can handle it by looking for logs like "planned vs. launched"
and increasing 'max_parallel_workers' if needed.
On the other hand, obviously it doesn't make sense to request more
workers than 'max_worker_processes' (moreover, this parameter cannot
be changed as easily as 'max_parallel_workers').
I will keep the 'max_worker_processes' limit, so autovacuum will not
waste time initializing a parallel context if there is no chance that
the request will succeed.
But it's worth remembering that actually the
'autovacuum_max_parallel_workers' parameter will always be implicitly
capped by 'max_parallel_workers'.
What do you think about it?
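To make the "implicit cap" concrete, here is a simplified model of how
many workers a leader would end up with. All names are hypothetical, and
the model assumes that the only constraints are the two GUCs and the
number of parallel workers already running; it ignores the separate
max_worker_processes slot limit.

```c
#include <assert.h>

static int
model_min(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Hypothetical model of "planned vs. launched".
 *
 * requested        - workers the autovacuum leader asks for
 * av_max_parallel  - autovacuum_max_parallel_workers
 * max_parallel     - max_parallel_workers
 * parallel_in_use  - parallel workers already running cluster-wide
 */
static int
model_launched_workers(int requested, int av_max_parallel,
					   int max_parallel, int parallel_in_use)
{
	int			planned;
	int			free_slots;

	assert(requested >= 0 && av_max_parallel >= 0);
	assert(max_parallel >= 0 && parallel_in_use >= 0);

	/* Explicit cap: the autovacuum-specific GUC. */
	planned = model_min(requested, av_max_parallel);

	/* Implicit cap: whatever the shared parallel-worker budget still allows. */
	free_slots = max_parallel - parallel_in_use;
	if (free_slots < 0)
		free_slots = 0;

	return model_min(planned, free_slots);
}
```

For example, requesting 8 workers with autovacuum_max_parallel_workers = 8
and max_parallel_workers = 4 yields at most 4 launched workers in this
model, and fewer if other parallel operations are already running.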
> But the postgresql.conf.sample say that it is limited by
> max_parallel_workers:
> +#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers
Good catch, I'll fix it.
> ---
>
> We actually capping the autovacuum_max_parallel_workers by
> max_worker_process-1, so we can't have 10 max_worker_process and 10
> autovacuum_max_parallel_workers. Is that correct?
Yep. The explanation is just above in this message.
> ---
>
> Locking unnecessary the AutovacuumLock if none if the if's is true can
> cause some performance issue here? I don't think that this would be a
> serious problem because this code will only be called if the
> configuration file is changed during the autovacuum execution right? But
> I could be wrong, so just sharing my thoughts on this (still learning
> about [auto]vacuum code).
>
> +
> +/*
> + * Make sure that number of available parallel workers corresponds to the
> + * autovacuum_max_parallel_workers parameter (after it was changed).
> + */
> +static void
> +check_parallel_av_gucs(int prev_max_parallel_workers)
> +{
> + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> +
> + if (AutoVacuumShmem->av_available_parallel_workers >
> + autovacuum_max_parallel_workers)
> + {
> + Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
> +
>
This function may be called by the a/v launcher when we already have
some a/v workers running.
A/v workers can change the
AutoVacuumShmem->av_available_parallel_workers value, so I think we
should acquire the appropriate lock before reading it.
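The pattern can be sketched outside PostgreSQL as a counter guarded by a
mutex. This is an illustration only: a pthread mutex stands in for
AutovacuumLock, and every "model_" name and the limit of 4 are
assumptions rather than code from the patch.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for AutovacuumLock; guards both variables below. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;
static int	model_limit = 4;		/* plays the role of the GUC limit */
static int	model_available = 4;	/* plays the role of the shmem counter */

/* Reserve up to nworkers; returns how many were actually reserved. */
static int
model_reserve(int nworkers)
{
	int			can_launch;

	assert(nworkers >= 0);
	pthread_mutex_lock(&model_lock);
	can_launch = (model_available < nworkers) ? model_available : nworkers;
	model_available -= can_launch;
	pthread_mutex_unlock(&model_lock);
	return can_launch;
}

/* Give workers back, never letting the counter exceed the limit. */
static void
model_release(int nworkers)
{
	assert(nworkers >= 0);
	pthread_mutex_lock(&model_lock);
	model_available += nworkers;
	if (model_available > model_limit)
		model_available = model_limit;
	pthread_mutex_unlock(&model_lock);
}

/* Even a plain read takes the lock, since other threads may update it. */
static int
model_peek(void)
{
	int			value;

	pthread_mutex_lock(&model_lock);
	value = model_available;
	pthread_mutex_unlock(&model_lock);
	return value;
}
```

The read in model_peek() is the analogue of the comparison at the top of
check_parallel_av_gucs(): it holds the lock so that a concurrent reserve
or release cannot change the value between the read and the decision
based on it.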
> Typo on "exeed"
>
> + /*
> + * Number of available workers must not exeed limit.
> + *
> + * Note, that if some parallel autovacuum workers are running at this
> + * moment, available workers number will not exeed limit after releasing
> + * them (see ParallelAutoVacuumReleaseWorkers).
> + */
Oops. I'll fix it.
> ---
>
> I'm not seeing an usage of this macro?
> +/*
> + * RelationGetParallelAutovacuumWorkers
> + * Returns the relation's parallel_autovacuum_workers reloption setting.
> + * Note multiple eval of argument!
> + */
> +#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
> + ((relation)->rd_options ? \
> + ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
> + (defaultpw))
> +
>
Yes, this is a relic of a past implementation. I'll delete this macro.
>
> I've made some tests and I can confirm that is working correctly for
> what I can see. I think that would be to start include the documentation
> changes, what do you think?
>
It sounds tempting :)
But perhaps first we should agree on the limitation of the
'autovacuum_max_parallel_workers' parameter.
Please see the v6 patches:
1) Fixed typos in autovacuum.c and postgresql.conf.sample
2) Removed unused macro 'RelationGetParallelAutovacuumWorkers'
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v6-0001-Parallel-index-autovacuum-with-bgworkers.patch (23.2K, 2-v6-0001-Parallel-index-autovacuum-with-bgworkers.patch)
From 20ef6a60d7eb4bbfa2d3e36ff36301abb26e4622 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v6 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/commands/vacuumparallel.c | 93 ++++++++---
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 2 +
11 files changed, 249 insertions(+), 27 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..e36d59f632b 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 09416450af9..b89b1563444 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3493,6 +3493,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = (AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose) ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3519,7 +3523,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..bd314d23298 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,16 +1,16 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
- * launch parallel worker processes at the start of parallel index
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one [auto]vacuum process. ParallelVacuumState contains shared information
+ * as well as the memory space for storing dead items allocated in the DSA area.
+ * We launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +445,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +465,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -541,7 +559,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*
* nrequested is the number of parallel workers that user requested. If
* nrequested is 0, we compute the parallel degree based on nindexes, that is
- * the number of indexes that support parallel vacuum. This function also
+ * the number of indexes that support parallel [auto]vacuum. This function also
* sets will_parallel_vacuum to remember indexes that participate in parallel
* vacuum.
*/
@@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, autovacuum_max_parallel_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -666,13 +688,26 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Reset the parallel index processing and progress counters */
pg_atomic_write_u32(&(pvs->shared->idx), 0);
+ /* Check how many workers autovacuum can provide. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
/* Setup the shared cost-based vacuum delay and launch workers */
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
if (num_index_scans > 0)
+ {
ReinitializeParallelDSM(pvs->pcxt);
+ /*
+ * Release all launched (i.e. reserved) parallel autovacuum
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
+ }
+
/*
* Set up shared cost balance and the number of active workers for
* vacuum delay. We need to do this before launching workers as
@@ -690,6 +725,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ ParallelAutoVacuumReleaseWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
else
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
- "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index cleanup (planned: %d)",
+ "launched %d parallel %svacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
}
/* Vacuum the indexes that can be processed by only leader process */
@@ -982,8 +1027,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 451fb90a610..9e8b00ae0cb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_available_parallel_workers the number of available parallel autovacuum
+ * workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +301,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_available_parallel_workers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +357,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void check_parallel_av_gucs(int prev_max_parallel_workers);
@@ -753,7 +757,9 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev;
+ autovacuum_max_parallel_workers_prev = autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +775,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ check_parallel_av_gucs(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2861,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3347,72 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_available_parallel_workers;
+ AutoVacuumShmem->av_available_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_available_parallel_workers -= nworkers;
+ }
+ LWLockRelease(AutovacuumLock);
+
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum workers die, leader worker must call this function
+ * in order to refresh global autovacuum state. Thus, other leaders will be able
+ * to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+ParallelAutoVacuumReleaseWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_available_parallel_workers += nworkers;
+
+ /*
+ * If autovacuum_max_parallel_workers variable was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3473,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3525,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3565,48 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
+ /*
+ * Number of available workers must not exceed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exceed limit after
+ * releasing them (see ParallelAutoVacuumReleaseWorkers).
+ */
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+ else if ((AutoVacuumShmem->av_available_parallel_workers <
+ autovacuum_max_parallel_workers) &&
+ (autovacuum_max_parallel_workers > prev_max_parallel_workers))
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of available workers in shmem.
+ */
+ AutoVacuumShmem->av_available_parallel_workers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+
+ /*
+ * Nothing to do when autovacuum_max_parallel_workers <
+ * prev_max_parallel_workers. Available workers number will be capped
+ * inside ParallelAutoVacuumReleaseWorkers.
+ */
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f04bfedb2fd..be76263c431 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 341f88adc87..3fbcbf8ef4f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..b5763e6ac36 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int ParallelAutoVacuumReserveWorkers(int nworkers);
+extern void ParallelAutoVacuumReleaseWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..5c66f37cd53 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..29c32f75780 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
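The reserve/release accounting that the patch above adds to autovacuum.c can be summarized with a small standalone model. This is only a sketch under stated assumptions: `available_workers` and `max_parallel_workers_guc` are hypothetical stand-ins for `av_available_parallel_workers` and `autovacuum_max_parallel_workers`, and the `AutovacuumLock` locking done by the real functions is left out.

```c
#include <assert.h>

/* Hypothetical stand-ins for the GUC and the shared-memory counter. */
static int max_parallel_workers_guc = 10;  /* models autovacuum_max_parallel_workers */
static int available_workers = 10;         /* models av_available_parallel_workers */

/* Reserve up to nworkers from the pool; returns how many were granted. */
static int
reserve_workers(int nworkers)
{
    int granted;

    assert(nworkers >= 0);

    /* Hand out everything that is left if the request exceeds the pool. */
    granted = (available_workers < nworkers) ? available_workers : nworkers;
    available_workers -= granted;
    return granted;
}

/* Return nworkers to the pool without exceeding the current limit. */
static void
release_workers(int nworkers)
{
    assert(nworkers >= 0);

    available_workers += nworkers;

    /* The limit may have been lowered while the workers were running. */
    if (available_workers > max_parallel_workers_guc)
        available_workers = max_parallel_workers_guc;
}
```

In this model a leader that asks for more workers than remain simply gets the remainder, and a release after the limit has been lowered is clamped to the new limit, which is the behaviour the comments in the patch describe.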
[text/x-patch] v6-0002-Sandbox-for-parallel-index-autovacuum.patch (8.6K, 3-v6-0002-Sandbox-for-parallel-index-autovacuum.patch)
From 6164c2cd633e9f3f95682e02d819c890519eef7c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v6 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ae892e5b4de
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so leader process will perform parallel index
+# vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
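For the settings used in the sandbox test above (parallel_autovacuum_workers = 5, autovacuum_max_parallel_workers = 10, 80 secondary indexes), the tail of parallel_vacuum_compute_workers() from the first patch can be modelled roughly as follows. This is a simplified sketch, not the patched function: `planned_workers` and `min_int` are hypothetical names, the number of eligible indexes is taken as a plain input, and the negative-request check (the reloption default of -1, which the caller handles in the real code) is folded in for convenience.

```c
#include <assert.h>

/* Smaller of two ints; stands in for PostgreSQL's Min() macro. */
static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * nrequested:        per-table parallel_autovacuum_workers
 *                    (0 = derive the degree from the index count)
 * nindexes_parallel: indexes eligible for parallel processing (assumed input)
 * guc_cap:           autovacuum_max_parallel_workers
 */
static int
planned_workers(int nrequested, int nindexes_parallel, int guc_cap)
{
    int workers;

    assert(nindexes_parallel >= 0 && guc_cap >= 0);

    /* A negative request, a zero cap, or no eligible index disables parallelism. */
    if (nrequested < 0 || guc_cap == 0 || nindexes_parallel == 0)
        return 0;

    /* An explicit request is limited by the number of eligible indexes. */
    workers = (nrequested > 0) ? min_int(nrequested, nindexes_parallel)
                               : nindexes_parallel;

    /* Finally, cap by the GUC. */
    return min_int(workers, guc_cap);
}
```

Assuming all 80 secondary indexes are eligible, `planned_workers(5, 80, 10)` yields 5, while a table with the reloption set to 0 would be limited by the GUC instead and get 10.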
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-08 15:20 Matheus Alcantara <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Matheus Alcantara @ 2025-07-08 15:20 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sun Jul 6, 2025 at 5:00 AM -03, Daniil Davydov wrote:
>> The "autovacuum_max_parallel_workers" GUC declared in guc_tables.c mentions
>> that it is capped by "max_worker_processes":
>> + {
>> + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
>> + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
>> + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
>> + },
>> + &autovacuum_max_parallel_workers,
>> + 0, 0, MAX_BACKENDS,
>> + check_autovacuum_max_parallel_workers, NULL, NULL
>> + },
>>
>> IIUC the code, it is capped by "max_worker_processes", but Masahiko mentioned
>> in [1] that it should be capped by max_parallel_workers.
> To be honest, I don't think that this parameter should be explicitly
> capped at all.
> Other parallel operations (for example parallel index build or VACUUM
> PARALLEL) just request as many workers as they want without looking at
> 'max_parallel_workers'.
> And they will not complain, if not all requested workers were launched.
>
> Thus, even if 'autovacuum_max_parallel_workers' is higher than
> 'max_parallel_workers' the worst that can happen is that not all
> requested workers will be running (which is a common situation).
> Users can handle it by looking for logs like "planned vs. launched"
> and increasing 'max_parallel_workers' if needed.
>
> On the other hand, obviously it doesn't make sense to request more
> workers than 'max_worker_processes' (moreover, this parameter cannot
> be changed as easily as 'max_parallel_workers').
>
> I will keep the 'max_worker_processes' limit, so autovacuum will not
> waste time initializing a parallel context if there is no chance that
> the request will succeed.
> But it's worth remembering that actually the
> 'autovacuum_max_parallel_workers' parameter will always be implicitly
> capped by 'max_parallel_workers'.
>
> What do you think about it?
>
It makes sense to me. The main benefit that I see in capping the
autovacuum_max_parallel_workers parameter is that users will see an
"invalid value for parameter "autovacuum_max_parallel_workers"" error in
the logs instead of needing to search for "planned vs. launched", which
can be tricky if log_min_messages is not set to at least the info level
(the default warning level will not show this log message). If we decide
not to cap this in the code, I think it would at least be good to mention
this in the documentation.
>>
>> I've made some tests and I can confirm that is working correctly for
>> what I can see. I think that would be to start include the documentation
>> changes, what do you think?
>>
>
> It sounds tempting :)
> But perhaps first we should agree on the limitation of the
> 'autovacuum_max_parallel_workers' parameter.
>
Agree
> Please, see v6 patches :
> 1) Fixed typos in autovacuum.c and postgresql.conf.sample
> 2) Removed unused macro 'RelationGetParallelAutovacuumWorkers'
>
Thanks!
--
Matheus Alcantara
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-09 05:26 Daniil Davydov <[email protected]>
parent: Matheus Alcantara <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-07-09 05:26 UTC (permalink / raw)
To: Matheus Alcantara <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Jul 8, 2025 at 10:20 PM Matheus Alcantara
<[email protected]> wrote:
>
> On Sun Jul 6, 2025 at 5:00 AM -03, Daniil Davydov wrote:
> > I will keep the 'max_worker_processes' limit, so autovacuum will not
> > waste time initializing a parallel context if there is no chance that
> > the request will succeed.
> > But it's worth remembering that actually the
> > 'autovacuum_max_parallel_workers' parameter will always be implicitly
> > capped by 'max_parallel_workers'.
> >
> > What do you think about it?
> >
>
> It make sense to me. The main benefit that I see on capping
> autovacuum_max_parallel_workers parameter is that users will see
> "invalid value for parameter "autovacuum_max_parallel_workers"" error on
> logs instead of need to search for "planned vs. launched", which can be
> trick if log_min_messages is not set to at least the info level (the
> default warning level will not show this log message).
>
I think I can refer to, for example, the 'max_parallel_workers_per_gather'
parameter, which can be set higher than 'max_parallel_workers' without
raising an error or warning.
'autovacuum_max_parallel_workers' will behave the same way.
> If we decide to not cap this on code I think that at least would be good to mention this
> on documentation.
Sure, it is worth noting in the documentation.
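To make the "implicit cap" concrete, here is a rough model of what happens when an operation asks for more workers than the pool can give. This is only a sketch: WorkerPoolModel and model_launch_parallel_workers() are hypothetical names, not PostgreSQL structures or functions, and the real bookkeeping in the bgworker code is more involved.

```c
#include <assert.h>

/* Hypothetical model of the shared worker pool; not PostgreSQL's real structs. */
typedef struct WorkerPoolModel
{
	int		max_worker_processes;	/* total background worker slots */
	int		max_parallel_workers;	/* how many of them parallel workers may use */
	int		busy_parallel_workers;	/* parallel workers currently running */
	int		busy_other_workers;		/* other background workers currently running */
} WorkerPoolModel;

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Returns how many of the requested workers would actually start.
 * A request larger than the pool is not an error; it is simply trimmed.
 */
static int
model_launch_parallel_workers(WorkerPoolModel *pool, int nrequested)
{
	int		free_parallel;
	int		free_slots;
	int		nlaunched;

	assert(nrequested >= 0);

	free_parallel = pool->max_parallel_workers - pool->busy_parallel_workers;
	free_slots = pool->max_worker_processes -
		pool->busy_parallel_workers - pool->busy_other_workers;

	nlaunched = min_int(nrequested, min_int(free_parallel, free_slots));
	if (nlaunched < 0)
		nlaunched = 0;

	pool->busy_parallel_workers += nlaunched;
	return nlaunched;
}
```

With max_parallel_workers = 8, a request for 10 workers yields 8 launched workers and no error, which is the "planned vs. launched" gap users would have to look for in the logs.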
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-14 07:09 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-07-14 07:09 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sun, Jul 6, 2025 at 5:00 PM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Fri, Jul 4, 2025 at 9:21 PM Matheus Alcantara
> <[email protected]> wrote:
> >
> > The "autovacuum_max_parallel_workers" declared on guc_tables.c mention
> > that is capped by "max_worker_process":
> > + {
> > + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
> > + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
> > + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
> > + },
> > + &autovacuum_max_parallel_workers,
> > + 0, 0, MAX_BACKENDS,
> > + check_autovacuum_max_parallel_workers, NULL, NULL
> > + },
> >
> > IIUC the code, it cap by "max_worker_process", but Masahiko has mention
> > on [1] that it should be capped by max_parallel_workers.
> >
>
> Thanks for looking into it!
>
> To be honest, I don't think that this parameter should be explicitly
> capped at all.
> Other parallel operations (for example parallel index build or VACUUM
> PARALLEL) just request as many workers as they want without looking at
> 'max_parallel_workers'.
> And they will not complain, if not all requested workers were launched.
>
> Thus, even if 'autovacuum_max_parallel_workers' is higher than
> 'max_parallel_workers' the worst that can happen is that not all
> requested workers will be running (which is a common situation).
> Users can handle it by looking for logs like "planned vs. launched"
> and increasing 'max_parallel_workers' if needed.
>
> On the other hand, obviously it doesn't make sense to request more
> workers than 'max_worker_processes' (moreover, this parameter cannot
> be changed as easily as 'max_parallel_workers').
>
> I will keep the 'max_worker_processes' limit, so autovacuum will not
> waste time initializing a parallel context if there is no chance that
> the request will succeed.
> But it's worth remembering that actually the
> 'autovacuum_max_parallel_workers' parameter will always be implicitly
> capped by 'max_parallel_workers'.
>
> What do you think about it?
>
> > But the postgresql.conf.sample say that it is limited by
> > max_parallel_workers:
> > +#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers
>
> Good catch, I'll fix it.
>
> > ---
> >
> > We actually capping the autovacuum_max_parallel_workers by
> > max_worker_process-1, so we can't have 10 max_worker_process and 10
> > autovacuum_max_parallel_workers. Is that correct?
>
> Yep. The explanation can be found just above in this letter.
>
> > ---
> >
> > Locking unnecessary the AutovacuumLock if none if the if's is true can
> > cause some performance issue here? I don't think that this would be a
> > serious problem because this code will only be called if the
> > configuration file is changed during the autovacuum execution right? But
> > I could be wrong, so just sharing my thoughts on this (still learning
> > about [auto]vacuum code).
> >
> > +
> > +/*
> > + * Make sure that number of available parallel workers corresponds to the
> > + * autovacuum_max_parallel_workers parameter (after it was changed).
> > + */
> > +static void
> > +check_parallel_av_gucs(int prev_max_parallel_workers)
> > +{
> > + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> > +
> > + if (AutoVacuumShmem->av_available_parallel_workers >
> > + autovacuum_max_parallel_workers)
> > + {
> > + Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
> > +
> >
>
> This function may be called by a/v launcher when we already have some
> a/v workers running.
> A/v workers can change the
> AutoVacuumShmem->av_available_parallel_workers value, so I think we
> should acquire appropriate lock before reading it.
>
> > Typo on "exeed"
> >
> > + /*
> > + * Number of available workers must not exeed limit.
> > + *
> > + * Note, that if some parallel autovacuum workers are running at this
> > + * moment, available workers number will not exeed limit after releasing
> > + * them (see ParallelAutoVacuumReleaseWorkers).
> > + */
>
> Oops. I'll fix it.
>
> > ---
> >
> > I'm not seeing an usage of this macro?
> > +/*
> > + * RelationGetParallelAutovacuumWorkers
> > + * Returns the relation's parallel_autovacuum_workers reloption setting.
> > + * Note multiple eval of argument!
> > + */
> > +#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
> > + ((relation)->rd_options ? \
> > + ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
> > + (defaultpw))
> > +
> >
>
> Yes, this is the relic of a past implementation. I'll delete this macro.
>
> >
> > I've made some tests and I can confirm that is working correctly for
> > what I can see. I think that would be to start include the documentation
> > changes, what do you think?
> >
>
> It sounds tempting :)
> But perhaps first we should agree on the limitation of the
> 'autovacuum_max_parallel_workers' parameter.
>
> Please, see v6 patches :
> 1) Fixed typos in autovacuum.c and postgresql.conf.sample
> 2) Removed unused macro 'RelationGetParallelAutovacuumWorkers'
>
Thank you for updating the patch! Here are some review comments:
---
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
Since we have similar code in dead_items_alloc(), I think it's better
to follow it:
int vac_work_mem = AmAutoVacuumWorkerProcess() &&
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
That is, we calculate vac_work_mem first and then calculate
shared->maintenance_work_mem_worker. I think it's more straightforward
as the formula of maintenance_work_mem_worker is the same whereas the
amount of memory used for vacuum and autovacuum varies.
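For illustration, a minimal sketch of that ordering. The globals below are stand-ins for the real GUCs, and per_worker_work_mem() and min_int() are hypothetical helpers, not functions in vacuumparallel.c. Choosing the budget first also covers autovacuum_work_mem = -1, which the hunk above would feed directly into the division:

```c
#include <assert.h>

/* Stand-ins for the GUCs (values in kB); -1 means "not set". */
static int	maintenance_work_mem = 65536;
static int	autovacuum_work_mem = -1;

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Hypothetical helper: pick the memory budget first, then apply the
 * single per-worker formula shared by VACUUM and autovacuum.
 */
static int
per_worker_work_mem(int is_autovacuum_worker, int parallel_workers,
					int nindexes_mwm)
{
	int		vac_work_mem = (is_autovacuum_worker &&
							autovacuum_work_mem != -1) ?
		autovacuum_work_mem : maintenance_work_mem;

	assert(parallel_workers > 0);

	return (nindexes_mwm > 0) ?
		vac_work_mem / min_int(parallel_workers, nindexes_mwm) :
		vac_work_mem;
}
```

The only part that differs between the two callers is the choice of vac_work_mem; the division itself appears once.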
---
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+
Why don't we release workers before destroying the parallel context?
---
@@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, autovacuum_max_parallel_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
How about calculating the maximum number of workers once and using it
in both of the places above?
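Roughly what I have in mind, as a simplified sketch. The globals stand in for the real GUCs, compute_workers() is a hypothetical reduction of parallel_vacuum_compute_workers(), and the real function does more bookkeeping that is left out here:

```c
/* Stand-ins for the GUCs; the values are arbitrary examples. */
static int	max_parallel_maintenance_workers = 2;
static int	autovacuum_max_parallel_workers = 0;

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Hypothetical, simplified version: the process-specific limit is
 * computed once and then used for both the early exit and the final cap.
 */
static int
compute_workers(int is_under_postmaster, int is_autovacuum_worker,
				int nindexes_parallel, int nrequested)
{
	int		max_workers = is_autovacuum_worker ?
		autovacuum_max_parallel_workers :
		max_parallel_maintenance_workers;
	int		parallel_workers;

	/* No parallelism in a standalone backend or when it is disabled. */
	if (!is_under_postmaster || max_workers == 0)
		return 0;

	/* Nothing to do in parallel. */
	if (nindexes_parallel <= 0)
		return 0;

	parallel_workers = (nrequested > 0) ?
		min_int(nrequested, nindexes_parallel) : nindexes_parallel;

	/* Cap by the limit chosen above. */
	return min_int(parallel_workers, max_workers);
}
```

That way AmAutoVacuumWorkerProcess() is consulted in one place only.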
---
+ /* Check how many workers can provide autovacuum. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
I think it's better to move this code to right after setting "nworkers
= Min(nworkers, pvs->pcxt->nworkers);" as the two are closely related.
The comment needs to be updated as it doesn't match what the function
actually does (i.e. reserving the workers).
---
/* Reinitialize parallel context to relaunch parallel workers */
if (num_index_scans > 0)
+ {
ReinitializeParallelDSM(pvs->pcxt);
+ /*
+ * Release all launched (i.e. reserved) parallel autovacuum
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
+ }
Why do we need to release all workers here? If there is a reason, we
should mention it as a comment.
---
@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
The "%svacuum" part doesn't work in terms of translation. We need to
construct the whole sentence instead. But do we need this log message
change in the first place? IIUC autovacuums write logs only when the
execution time exceeds log_autovacuum_min_duration (or its
reloption). The patch unconditionally sets LOG level for autovacuums
but I'm not sure it's consistent with other autovacuum logging
behavior:
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
---
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
The patch includes the change of "vacuum" -> "[auto]vacuum" in many
places. While I think we need to mention that vacuumparallel.c
supports autovacuums, I'm not sure we really need all of them. If we
accept this style, we would require all subsequent changes to
follow it, which could increase maintenance costs.
---
@@ -299,6 +301,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_available_parallel_workers;
Other field names seem to have consistent naming rules; 'av_' prefix
followed by name in camel case. So how about renaming it to
av_freeParallelWorkers or something along those lines?
---
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
Other exposed functions have the "AutoVacuum" prefix, so how about
renaming it to AutoVacuumReserveParallelWorkers() or something along
those lines?
---
+ if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_available_parallel_workers;
+ AutoVacuumShmem->av_available_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_available_parallel_workers -= nworkers;
+ }
Can we simplify this logic as follows?
can_launch = Min(AutoVacuumShmem->av_available_parallel_workers, nworkers);
AutoVacuumShmem->av_available_parallel_workers -= can_launch;
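A small standalone sketch showing that the two forms behave the same. The counter is passed in as a plain pointer instead of living in AutoVacuumShmem, locking is omitted, and reserve_branching() and reserve_min() are hypothetical names used only for this comparison:

```c
#include <assert.h>
#include <stdint.h>

/* The original if/else form from the patch. */
static int
reserve_branching(uint32_t *avail, int nworkers)
{
	int		can_launch;

	assert(nworkers >= 0);

	if (*avail < (uint32_t) nworkers)
	{
		/* Provide as many workers as we can. */
		can_launch = (int) *avail;
		*avail = 0;
	}
	else
	{
		/* OK, we can provide all requested workers. */
		can_launch = nworkers;
		*avail -= (uint32_t) nworkers;
	}

	return can_launch;
}

/* The simplified form: take the minimum, then subtract it. */
static int
reserve_min(uint32_t *avail, int nworkers)
{
	uint32_t	can_launch;

	assert(nworkers >= 0);

	can_launch = (*avail < (uint32_t) nworkers) ?
		*avail : (uint32_t) nworkers;
	*avail -= can_launch;

	return (int) can_launch;
}
```

Since can_launch never exceeds the available count, the unsigned subtraction cannot wrap around in either version.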
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-14 10:49 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-07-14 10:49 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Mon, Jul 14, 2025 at 2:10 PM Masahiko Sawada <[email protected]> wrote:
>
> ---
> - shared->maintenance_work_mem_worker =
> - (nindexes_mwm > 0) ?
> - maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
> - maintenance_work_mem;
> +
> + if (AmAutoVacuumWorkerProcess())
> + shared->maintenance_work_mem_worker =
> + (nindexes_mwm > 0) ?
> + autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
> + autovacuum_work_mem;
> + else
> + shared->maintenance_work_mem_worker =
> + (nindexes_mwm > 0) ?
> + maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
> + maintenance_work_mem;
>
> Since we have a similar code in dead_items_alloc() I think it's better
> to follow it:
>
> int vac_work_mem = AmAutoVacuumWorkerProcess() &&
> autovacuum_work_mem != -1 ?
> autovacuum_work_mem : maintenance_work_mem;
>
> That is, we calculate vac_work_mem first and then calculate
> shared->maintenance_work_mem_worker. I think it's more straightforward
> as the formula of maintenance_work_mem_worker is the same whereas the
> amount of memory used for vacuum and autovacuum varies.
>
I was confused by the fact that initially maintenance_work_mem was used
for the calculations, not vac_work_mem. I agree that it is better to use
the already calculated vac_work_mem value.
> ---
> + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
> DestroyParallelContext(pvs->pcxt);
> +
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
> +
>
> Why don't we release workers before destroying the parallel context?
>
Destroying the parallel context includes waiting for all workers to exit
(after which other operations can use them).
If we call ParallelAutoVacuumReleaseWorkers first, some other operation can
reasonably request all of the released workers. But that request can fail,
because there is no guarantee that the workers have finished yet.
Actually, there's nothing wrong with that, but I think releasing workers
only after they have finished their work is a more logical approach.
> ---
> @@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
> * We don't allow performing parallel operation in standalone backend or
> * when parallelism is disabled.
> */
> - if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
> + if (!IsUnderPostmaster ||
> + (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
> + (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
> return 0;
>
> /*
> @@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
> parallel_workers = (nrequested > 0) ?
> Min(nrequested, nindexes_parallel) : nindexes_parallel;
>
> - /* Cap by max_parallel_maintenance_workers */
> - parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
> + /* Cap by GUC variable */
> + parallel_workers = AmAutoVacuumWorkerProcess() ?
> + Min(parallel_workers, autovacuum_max_parallel_workers) :
> + Min(parallel_workers, max_parallel_maintenance_workers);
>
> return parallel_workers;
>
> How about calculating the maximum number of workers once and using it
> in the above both places?
>
Agree. Good idea.
> ---
> + /* Check how many workers can provide autovacuum. */
> + if (AmAutoVacuumWorkerProcess() && nworkers > 0)
> + nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
> +
>
> I think it's better to move this code to right after setting "nworkers
> = Min(nworkers, pvs->pcxt->nworkers);" as it's a more related code.
>
> The comment needs to be updated as it doesn't match what the function
> actually does (i.e. reserving the workers).
>
You are right, I'll fix it.
> ---
> /* Reinitialize parallel context to relaunch parallel workers */
> if (num_index_scans > 0)
> + {
> ReinitializeParallelDSM(pvs->pcxt);
>
> + /*
> + * Release all launched (i.e. reserved) parallel autovacuum
> + * workers.
> + */
> + if (AmAutoVacuumWorkerProcess())
> + ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
> + }
>
> Why do we need to release all workers here? If there is a reason, we
> should mention it as a comment.
>
Hm, I guess it was left over from previous patch versions. Actually
we don't need to release workers here, as we will try to launch them
immediately. It is a bug, thank you for noticing it.
> ---
> @@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
>
> if (vacuum)
> ereport(pvs->shared->elevel,
> - (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
> - "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
> + (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
> + "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
> pvs->pcxt->nworkers_launched),
> - pvs->pcxt->nworkers_launched, nworkers)));
> + pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
>
> The "%svacuum" part doesn't work in terms of translation. We need to
> construct the whole sentence instead.
> But do we need this log message
> change in the first place? IIUC autovacuums write logs only when the
> execution time exceed the log_autovacuum_min_duration (or its
> reloption). The patch unconditionally sets LOG level for autovacuums
> but I'm not sure it's consistent with other autovacuum logging
> behavior:
>
> + int elevel = AmAutoVacuumWorkerProcess() ||
> + vacrel->verbose ?
> + INFO : DEBUG2;
>
>
This log level is used only "for messages about parallel workers launched".
I think that such logs relate more to the parallel workers module than to
autovacuum itself. Moreover, if we emit the "planned vs. launched" log each
time, it will simplify the task of selecting the optimal value of the
'autovacuum_max_parallel_workers' parameter. What do you think?
About "%svacuum" - I guess we need to clarify what exactly the workers
were launched for. I'll add an errhint to this log, but I don't know whether
such an approach is acceptable.
> - * Support routines for parallel vacuum execution.
> + * Support routines for parallel [auto]vacuum execution.
>
> The patch includes the change of "vacuum" -> "[auto]vacuum" in many
> places. While I think we need to mention that vacuumparallel.c
> supports autovacuums I'm not sure we really need all of them. If we
> accept this style, we would require for all subsequent changes to
> follow it, which could increase maintenance costs.
>
Agreed. I'll leave a comment saying that vacuumparallel.c also supports
parallel autovacuum. All other "[auto]vacuum" changes will be removed.
> ---
> @@ -299,6 +301,7 @@ typedef struct
> WorkerInfo av_startingWorker;
> AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
> pg_atomic_uint32 av_nworkersForBalance;
> + uint32 av_available_parallel_workers;
>
> Other field names seem to have consistent naming rules; 'av_' prefix
> followed by name in camel case. So how about renaming it to
> av_freeParallelWorkers or something along those lines?
>
> ---
> +int
> +ParallelAutoVacuumReserveWorkers(int nworkers)
> +{
>
> Other exposed functions have "AutoVacuum" prefix, so how about
> renaming it to AutoVacuumReserveParallelWorkers() or something along
> those lines?
>
I agree with both comments; I'll rename the structure field and the functions.
> ---
> + if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
> + {
> + /* Provide as many workers as we can. */
> + can_launch = AutoVacuumShmem->av_available_parallel_workers;
> + AutoVacuumShmem->av_available_parallel_workers = 0;
> + }
> + else
> + {
> + /* OK, we can provide all requested workers. */
> + can_launch = nworkers;
> + AutoVacuumShmem->av_available_parallel_workers -= nworkers;
> + }
>
> Can we simplify this logic as follows?
>
> can_launch = Min(AutoVacuumShmem->av_available_parallel_workers, nworkers);
> AutoVacuumShmem->av_available_parallel_workers -= can_launch;
>
Sure, I'll simplify it.
---
Thank you very much for your comments! Please see the v7 patch:
1) Renamed a few functions and variables and got rid of comments like
"[auto]vacuum" in vacuumparallel.c
2) Simplified logic in the 'parallel_vacuum_init' and
'AutoVacuumReserveParallelWorkers' functions
3) Refactored and fixed a bug in the 'parallel_vacuum_process_all_indexes' function
4) Changed the "planned vs. launched" logging so it can be translated
5) Rebased on the newest commit in the master branch
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v7-0002-Sandbox-for-parallel-index-autovacuum.patch (8.6K, 2-v7-0002-Sandbox-for-parallel-index-autovacuum.patch)
From 7af255b4d0a5e7927f6a1c212c4b2342d6b044a7 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v7 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ae892e5b4de
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so leader process will perform parallel index
+# vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
[text/x-patch] v7-0001-Parallel-index-autovacuum-with-bgworkers.patch (19.6K, 3-v7-0001-Parallel-index-autovacuum-with-bgworkers.patch)
download | inline diff:
From 55b76f15bbc3991b7457de6c1d6998d39b16292c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v7 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/commands/vacuumparallel.c | 57 ++++++--
src/backend/postmaster/autovacuum.c | 135 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 2 +
11 files changed, 220 insertions(+), 11 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..e36d59f632b 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..7e0ae0184aa 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3477,6 +3477,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3503,7 +3507,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..6ec610e29e4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -371,10 +374,12 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
+
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +440,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +460,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -553,12 +566,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_parallel_workers;
+
+ max_parallel_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_parallel_workers == 0)
return 0;
/*
@@ -597,8 +615,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_parallel_workers);
return parallel_workers;
}
@@ -646,6 +664,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Also reserve workers in autovacuum global state. Note, that we may be
+ * given fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +715,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -709,13 +744,19 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
"launched %d parallel vacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, nworkers),
+ AmAutoVacuumWorkerProcess() ?
+ errhint("workers were launched for parallel autovacuum") :
+ errhint("workers were launched for parallel vacuum")));
else
ereport(pvs->shared->elevel,
(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, nworkers),
+ AmAutoVacuumWorkerProcess() ?
+ errhint("workers were launched for parallel autovacuum") :
+ errhint("workers were launched for parallel vacuum")));
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9474095f271..98609ac8f8f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +356,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void check_parallel_av_gucs(int prev_max_parallel_workers);
@@ -753,7 +756,9 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev;
+ autovacuum_max_parallel_workers_prev = autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +774,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ check_parallel_av_gucs(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2860,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3346,64 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ can_launch = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum worker die, leader worker must call this function
+ * in order to refresh global autovacuum state. Thus, other leaders will be able
+ * to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_freeParallelWorkers += nworkers;
+
+ /*
+ * If autovacuum_max_parallel_workers variable was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ if (AutoVacuumShmem->av_freeParallelWorkers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3464,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3516,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3556,48 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_freeParallelWorkers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
+ /*
+ * Number of available workers must not exceed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exceed limit after
+ * releasing them (see ParallelAutoVacuumReleaseWorkers).
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
+ }
+ else if ((AutoVacuumShmem->av_freeParallelWorkers <
+ autovacuum_max_parallel_workers) &&
+ (autovacuum_max_parallel_workers > prev_max_parallel_workers))
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of available workers in shmem.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+
+ /*
+ * Nothing to do when autovacuum_max_parallel_workers <
+ * prev_max_parallel_workers. Available workers number will be capped
+ * inside ParallelAutoVacuumReleaseWorkers.
+ */
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..b6a192af8f8 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..863d206f2bd 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 82ac8646a8d..b45023a90b2 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..29c32f75780 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-17 19:42 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-07-17 19:42 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Jul 14, 2025 at 3:49 AM Daniil Davydov <[email protected]> wrote:
>
>
> > ---
> > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
> > DestroyParallelContext(pvs->pcxt);
> > +
> > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> > + if (AmAutoVacuumWorkerProcess())
> > + ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
> > +
> >
> > Why don't we release workers before destroying the parallel context?
> >
>
> Destroying parallel context includes waiting for all workers to exit (after
> which, other operations can use them).
> If we first call ParallelAutoVacuumReleaseWorkers, some operation can
> reasonably request all released workers. But this request can fail,
> because there is no guarantee that workers managed to finish.
>
> Actually, there's nothing wrong with that, but I think releasing workers
> only after finishing work is a more logical approach.
>
> > ---
> > @@ -706,16 +751,16 @@
> > parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int
> > num_index_scan
> >
> > if (vacuum)
> > ereport(pvs->shared->elevel,
> > - (errmsg(ngettext("launched %d parallel vacuum
> > worker for index vacuuming (planned: %d)",
> > - "launched %d parallel vacuum
> > workers for index vacuuming (planned: %d)",
> > + (errmsg(ngettext("launched %d parallel %svacuum
> > worker for index vacuuming (planned: %d)",
> > + "launched %d parallel %svacuum
> > workers for index vacuuming (planned: %d)",
> > pvs->pcxt->nworkers_launched),
> > - pvs->pcxt->nworkers_launched, nworkers)));
> > + pvs->pcxt->nworkers_launched,
> > AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
> >
> > The "%svacuum" part doesn't work in terms of translation. We need to
> > construct the whole sentence instead.
> > But do we need this log message
> > change in the first place? IIUC autovacuums write logs only when the
> > execution time exceed the log_autovacuum_min_duration (or its
> > reloption). The patch unconditionally sets LOG level for autovacuums
> > but I'm not sure it's consistent with other autovacuum logging
> > behavior:
> >
> > + int elevel = AmAutoVacuumWorkerProcess() ||
> > + vacrel->verbose ?
> > + INFO : DEBUG2;
> >
> >
>
> This log level is used only "for messages about parallel workers launched".
> I think that such logs relate more to the parallel workers module than
> autovacuum itself. Moreover, if we emit log "planned vs. launched" each
> time, it will simplify the task of selecting the optimal value of
> 'autovacuum_max_parallel_workers' parameter. What do you think?
INFO-level messages are normally not sent to the server log. Also,
autovacuum doesn't write any log entry when it starts. If we want to
log planned vs. launched, I think it's better to gather these
statistics during execution and write them together with the other
existing logs.
>
> About "%svacuum" - I guess we need to clarify what exactly the workers
> were launched for. I'll add errhint to this log, but I don't know whether such
> approach is acceptable.
I'm not sure errhint is an appropriate place. If we write such
information together with other existing autovacuum logs as I
suggested above, I think we don't need to add such information to this
log message.
I've reviewed v7 patch and here are some comments:
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be
taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on
number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
Many autovacuum related reloptions have the prefix "autovacuum". So
how about renaming it to autovacuum_parallel_worker (change
check_parallel_av_gucs() name too accordingly)?
---
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
I think we don't need to strictly check the
autovacuum_max_parallel_workers value. Instead, we can accept any
integer value but internally cap by max_worker_processes.
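One way to read this suggestion, as a minimal sketch only: the helper name below is hypothetical and the two variables are stand-ins for the real GUCs, not the actual patch code.

```c
/* Stand-ins for the real GUC variables (values are illustrative only). */
static int max_worker_processes = 8;
static int autovacuum_max_parallel_workers = 0;

/*
 * Hypothetical helper: accept whatever value the user configured, but
 * never treat more than max_worker_processes as usable.
 */
static int
effective_av_max_parallel_workers(void)
{
    if (autovacuum_max_parallel_workers > max_worker_processes)
        return max_worker_processes;
    return autovacuum_max_parallel_workers;
}
```

With this shape, setting the GUC above max_worker_processes would not be rejected; the excess would simply be ignored where the value is used.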
---
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
I think this function doesn't just check the value but does adjust the
number of available workers, so how about
adjust_free_parallel_workers() or something along these lines?
---
+ /*
+ * Number of available workers must not exceed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exceed limit after
+ * releasing them (see ParallelAutoVacuumReleaseWorkers).
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
I think the comment refers to the following code in
AutoVacuumReleaseParallelWorkers():
+ /*
+ * If autovacuum_max_parallel_workers variable was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ if (AutoVacuumShmem->av_freeParallelWorkers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
+ }
After the autovacuum launcher decreases av_freeParallelWorkers, it's
not guaranteed that the autovacuum worker has already picked up the new
value from the config file when it executes
AutoVacuumReleaseParallelWorkers(), which leads to skipping the above
code. For example, suppose that autovacuum_max_parallel_workers is 10
and 3 parallel workers are being run by one autovacuum worker (i.e.,
av_freeParallelWorkers = 7 now). If the user changes
autovacuum_max_parallel_workers to 5, the autovacuum launcher adjusts
av_freeParallelWorkers to 5. However, if the worker doesn't reload the
config file and executes AutoVacuumReleaseParallelWorkers(), it
increases av_freeParallelWorkers to 8 and skips the adjusting logic.
I've not tested this scenario so I might be missing something though.
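The arithmetic of this scenario can be replayed with a small standalone model. This is only a sketch under stated assumptions: SharedState, launcher_adjust(), worker_release() and replay_scenario() are hypothetical names, locking is omitted, and the worker's view of the GUC is passed in explicitly as local_max.

```c
/* Simplified model of the shared counter; not the real AutoVacuumShmem. */
typedef struct
{
    int free_parallel_workers;
} SharedState;

/* Launcher side after a config reload: clamp the free count to the new limit. */
static void
launcher_adjust(SharedState *shm, int new_max)
{
    if (shm->free_parallel_workers > new_max)
        shm->free_parallel_workers = new_max;
}

/*
 * Worker side: give workers back, then cap by the worker's own view of the
 * limit, which may be stale if the worker has not reloaded the config.
 */
static void
worker_release(SharedState *shm, int nworkers, int local_max)
{
    shm->free_parallel_workers += nworkers;
    if (shm->free_parallel_workers > local_max)
        shm->free_parallel_workers = local_max;
}

/* Replays the example: limit 10, 3 workers reserved, limit lowered to 5. */
static int
replay_scenario(void)
{
    SharedState shm = {7};          /* 10 - 3 reserved */

    launcher_adjust(&shm, 5);       /* launcher sees the new limit: free = 5 */
    worker_release(&shm, 3, 10);    /* worker still believes the limit is 10 */
    return shm.free_parallel_workers;
}
```

In this model the counter ends at 8, above the new limit of 5; passing 5 as local_max to worker_release() would cap it at 5 instead.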
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-20 16:43 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-07-20 16:43 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Sami Imseih <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Fri, Jul 18, 2025 at 2:43 AM Masahiko Sawada <[email protected]> wrote:
>
> On Mon, Jul 14, 2025 at 3:49 AM Daniil Davydov <[email protected]> wrote:
> >
> > This log level is used only "for messages about parallel workers launched".
> > I think that such logs relate more to the parallel workers module than
> > autovacuum itself. Moreover, if we emit log "planned vs. launched" each
> > time, it will simplify the task of selecting the optimal value of
> > 'autovacuum_max_parallel_workers' parameter. What do you think?
>
> INFO level is normally not sent to the server log. And regarding
> autovacuums, they don't write any log mentioning it started. If we
> want to write planned vs. launched I think it's better to gather these
> statistics during execution and write it together with other existing
> logs.
>
> >
> > About "%svacuum" - I guess we need to clarify what exactly the workers
> > were launched for. I'll add errhint to this log, but I don't know whether such
> > approach is acceptable.
>
> I'm not sure errhint is an appropriate place. If we write such
> information together with other existing autovacuum logs as I
> suggested above, I think we don't need to add such information to this
> log message.
>
I thought about it for some time and came up with this idea:
1)
When gathering such statistics, we need to take into account that users
might not want autovacuum to log anything. Thus, we should collect the
statistics at a "higher" level that knows about log_min_duration.
2)
By analogy with the rest of the statistics, we can accumulate only the
total number of planned and launched parallel workers. Alternatively, we
could build an array (one element for each index scan) of "planned vs.
launched". But that would make the code "dirty", and I'm not sure it
would be useful.
This may be a discussion point, so I will split it into a separate patch file.
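As a rough illustration of point 2, the totals could be accumulated like this. The struct and function names are hypothetical (they are not taken from the patch), and a NULL pointer is assumed to mean that the caller does not want statistics collected.

```c
#include <stddef.h>

/* Hypothetical totals over all index scans of one vacuum. */
typedef struct
{
    int nplanned;   /* workers planned, summed over index scans */
    int nlaunched;  /* workers actually launched, summed over index scans */
} WorkersUsageSketch;

/*
 * Add one index scan's numbers to the totals. A NULL 'usage' means the
 * higher level decided that nothing will be logged, so nothing is collected.
 */
static void
accumulate_workers_usage(WorkersUsageSketch *usage, int planned, int launched)
{
    if (usage == NULL)
        return;
    usage->nplanned += planned;
    usage->nlaunched += launched;
}
```

The per-scan array alternative would keep one such pair per index scan instead of a single running total.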
> I've reviewed v7 patch and here are some comments:
>
> + {
> + {
> + "parallel_autovacuum_workers",
> + "Maximum number of parallel autovacuum workers that can be
> taken from bgworkers pool for processing this table. "
> + "If value is 0 then parallel degree will computed based on
> number of indexes.",
> + RELOPT_KIND_HEAP,
> + ShareUpdateExclusiveLock
> + },
> + -1, -1, 1024
> + },
>
> Many autovacuum related reloptions have the prefix "autovacuum". So
> how about renaming it to autovacuum_parallel_worker (change
> check_parallel_av_gucs() name too accordingly)?
>
I have no objections.
> ---
> +bool
> +check_autovacuum_max_parallel_workers(int *newval, void **extra,
> + GucSource source)
> +{
> + if (*newval >= max_worker_processes)
> + return false;
> + return true;
> +}
>
> I think we don't need to strictly check the
> autovacuum_max_parallel_workers value. Instead, we can accept any
> integer value but internally cap by max_worker_processes.
>
I don't think that such a limitation is excessive, but I don't see similar
behavior in other "max_parallel_..." GUCs, so I think we can get
rid of it. I'll replace the "check hook" with an "assign hook", where
autovacuum_max_parallel_workers will be limited.
> ---
> +/*
> + * Make sure that number of available parallel workers corresponds to the
> + * autovacuum_max_parallel_workers parameter (after it was changed).
> + */
> +static void
> +check_parallel_av_gucs(int prev_max_parallel_workers)
> +{
>
> I think this function doesn't just check the value but does adjust the
> number of available workers, so how about
> adjust_free_parallel_workers() or something along these lines?
I agree, it's better this way.
>
> ---
> + /*
> + * Number of available workers must not exceed limit.
> + *
> + * Note, that if some parallel autovacuum workers are running at this
> + * moment, available workers number will not exceed limit after
> + * releasing them (see ParallelAutoVacuumReleaseWorkers).
> + */
> + AutoVacuumShmem->av_freeParallelWorkers =
> + autovacuum_max_parallel_workers;
>
> I think the comment refers to the following code in
> AutoVacuumReleaseParallelWorkers():
>
> + /*
> + * If autovacuum_max_parallel_workers variable was reduced during parallel
> + * autovacuum execution, we must cap available workers number by its new
> + * value.
> + */
> + if (AutoVacuumShmem->av_freeParallelWorkers >
> + autovacuum_max_parallel_workers)
> + {
> + AutoVacuumShmem->av_freeParallelWorkers =
> + autovacuum_max_parallel_workers;
> + }
>
> After the autovacuum launcher decreases av_freeParallelWorkers, it's
> not guaranteed that the autovacuum worker has already picked up the new
> value from the config file when it executes
> AutoVacuumReleaseParallelWorkers(), which leads to skipping the above
> code. For example, suppose that autovacuum_max_parallel_workers is 10
> and 3 parallel workers are being run by one autovacuum worker (i.e.,
> av_freeParallelWorkers = 7 now). If the user changes
> autovacuum_max_parallel_workers to 5, the autovacuum launcher adjusts
> av_freeParallelWorkers to 5. However, if the worker doesn't reload the
> config file and executes AutoVacuumReleaseParallelWorkers(), it
> increases av_freeParallelWorkers to 8 and skips the adjusting logic.
> I've not tested this scenario so I might be missing something though.
>
Yes, this is a possible scenario. I'll rework the av_freeParallelWorkers
calculation. The main change is that the autovacuum worker now checks
whether a config reload is pending, so it will have the current value of
the autovacuum_max_parallel_workers parameter.
Thank you very much for your comments! Please see the v8 patches:
1) Rename table option.
2) Replace check_hook with assign_hook for autovacuum_max_parallel_workers.
3) Simplify and correct logic for handling
autovacuum_max_parallel_workers parameter change.
4) Rework logic with "planned vs. launched" statistics for autovacuum
(see second patch file).
5) Get rid of "sandbox" - I don't see the point in continuing to drag it along.
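For readers who want the worker accounting in 0001 at a glance, here is a rough single-pass model of it. It is a sketch under assumptions: ToyPool and the pool_* functions are hypothetical names, the "startable" argument stands in for however many workers the system can actually start, and the sketch subtracts only what was granted so the counter cannot drop below zero.

```c
#include <assert.h>

/* Hypothetical model of the shared pool of parallel autovacuum workers. */
typedef struct
{
    int free_slots;
} ToyPool;

static int
toy_min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Reserve up to 'wanted' slots; returns how many were actually granted. */
static int
pool_reserve(ToyPool *pool, int wanted)
{
    int granted = toy_min_int(pool->free_slots, wanted);

    pool->free_slots -= granted;
    return granted;
}

/* Give n slots back, never exceeding the configured limit. */
static void
pool_release(ToyPool *pool, int n, int limit)
{
    assert(n >= 0);
    pool->free_slots = toy_min_int(pool->free_slots + n, limit);
}

/* One index pass followed by teardown; returns the launched count. */
static int
pool_run_once(ToyPool *pool, int limit, int wanted, int startable)
{
    int reserved = pool_reserve(pool, wanted);
    int launched = toy_min_int(reserved, startable);

    /* Slots that were reserved but never launched go back right away. */
    if (launched < reserved)
        pool_release(pool, reserved - launched, limit);

    /* ... index processing, then wait for the workers to exit ... */

    /* The launched workers' slots are returned only after they are done. */
    pool_release(pool, launched, limit);
    return launched;
}
```

After a full pass the pool is back at its starting value, whatever the shortfall between reserved and launched workers was.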
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v8-0002-Logging-for-parallel-autovacuum.patch (7.4K, 2-v8-0002-Logging-for-parallel-autovacuum.patch)
From 27b2c7d0dfb193aadd9d0199647e5909de3ac0aa Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 20 Jul 2025 23:26:13 +0700
Subject: [PATCH v8 2/2] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 26 ++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
3 files changed, 52 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..11dc2c48a7e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -348,6 +348,11 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -688,6 +693,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated.
+ * For now, this is used only by autovacuum leader worker, because it
+ * must log it in the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1012,6 +1027,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2634,7 +2654,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3047,7 +3068,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 38cd6f68105..831cc64b529 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -510,7 +510,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -521,7 +521,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -529,7 +529,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -541,7 +542,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -626,7 +627,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -750,6 +751,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..64b23687506 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
[text/x-patch] v8-0001-Parallel-index-autovacuum.patch (16.8K, 3-v8-0001-Parallel-index-autovacuum.patch)
From 74329dfbaebff1878c443d70b45aa1b5f7f2ef74 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 20 Jul 2025 23:03:57 +0700
Subject: [PATCH v8 1/2] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/commands/vacuumparallel.c | 46 ++++++-
src/backend/postmaster/autovacuum.c | 120 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 1 +
src/include/utils/rel.h | 2 +
10 files changed, 190 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..54abe7f21f5 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will be computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..38cd6f68105 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +439,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +459,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -553,12 +565,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_parallel_workers;
+
+ max_parallel_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_parallel_workers == 0)
return 0;
/*
@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_parallel_workers);
return parallel_workers;
}
@@ -646,6 +663,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Also reserve workers in autovacuum global state. Note, that we may be
+ * given fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +714,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9474095f271..61a50c9eca8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +356,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -753,6 +756,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +774,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2860,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3346,68 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ can_launch = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= can_launch;
+
+ LWLockRelease(AutovacuumLock);
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum worker die, leader worker must call this function
+ * in order to refresh global autovacuum state. Thus, other leaders will be
+ * able to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /* Refresh autovacuum_max_parallel_workers parameter */
+ CHECK_FOR_INTERRUPTS();
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If autovacuum_max_parallel_workers parameter was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ autovacuum_max_parallel_workers);
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3468,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3520,12 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+void
+assign_autovacuum_max_parallel_workers(int newval, void *extra)
+{
+ autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3557,32 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..4941ad976df 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ NULL, assign_autovacuum_max_parallel_workers, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..863d206f2bd 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 82ac8646a8d..04833b4f147 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,7 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern void assign_autovacuum_max_parallel_workers(int newval, void *extra);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..377000199d7 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int autovacuum_parallel_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-21 16:40 Sami Imseih <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Sami Imseih @ 2025-07-21 16:40 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Thanks for the patches!
I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and
have a few comments from my initial pass.
1/ Please run pgindent.
2/ Documentation is missing. There may be more, but here are the places I
found that likely need updates for the new behavior, reloptions, GUC, etc.
Including docs in the patch early would help clarify expected behavior.
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
https://www.postgresql.org/docs/current/sql-createtable.html
https://www.postgresql.org/docs/current/sql-altertable.html
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WO...
One thing I am unclear on is the interaction between max_worker_processes,
max_parallel_workers, and max_parallel_maintenance_workers. For example, does
the following change mean that manual VACUUM PARALLEL is no longer capped by
max_parallel_maintenance_workers?
@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_parallel_workers);
3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ?
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
4/ The comment "When parallel autovacuum worker die" suggests an abnormal
exit. "Terminates" seems clearer, since this applies to both normal and
abnormal exits.
instead of:
+ * When parallel autovacuum worker die,
how about this:
* When parallel autovacuum worker terminates,
5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before
DestroyParallelContext?
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside
AutoVacuumReleaseParallelWorkers()?
if (!AmAutoVacuumWorkerProcess())
return;
7/ It looks like the psql tab completion for autovacuum_parallel_workers is
missing:
test=# alter table t set (autovacuum_
autovacuum_analyze_scale_factor
autovacuum_analyze_threshold
autovacuum_enabled
autovacuum_freeze_max_age
autovacuum_freeze_min_age
autovacuum_freeze_table_age
autovacuum_multixact_freeze_max_age
autovacuum_multixact_freeze_min_age
autovacuum_multixact_freeze_table_age
autovacuum_vacuum_cost_delay
autovacuum_vacuum_cost_limit
autovacuum_vacuum_insert_scale_factor
autovacuum_vacuum_insert_threshold
autovacuum_vacuum_max_threshold
autovacuum_vacuum_scale_factor
autovacuum_vacuum_threshold
--
Sami Imseih
Amazon Web Services (AWS)
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-07-22 06:45 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-07-22 06:45 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <[email protected]> wrote:
>
> I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and
> have a few comments from my initial pass.
>
> 1/ Please run pgindent.
OK, I'll do it.
> 2/ Documentation is missing. There may be more, but here are the places I
> found that likely need updates for the new behavior, reloptions, GUC, etc.
> Including docs in the patch early would help clarify expected behavior.
>
> https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS
> https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
> https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
> https://www.postgresql.org/docs/current/sql-createtable.html
> https://www.postgresql.org/docs/current/sql-altertable.html
> https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
> https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WO...
>
Thanks for gathering it all together. I'll update the documentation so that
it reflects the changes in the autovacuum daemon, reloptions and GUC
parameters. So far, I don't see what we can add to the vacuum-basics
and alter-table sections.
I'll create a separate .patch file for the documentation changes.
> One thing I am unclear on is the interaction between max_worker_processes,
> max_parallel_workers, and max_parallel_maintenance_workers. For example, does
> the following change mean that manual VACUUM PARALLEL is no longer capped by
> max_parallel_maintenance_workers?
>
> @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
> parallel_workers = (nrequested > 0) ?
> Min(nrequested, nindexes_parallel) : nindexes_parallel;
>
> - /* Cap by max_parallel_maintenance_workers */
> - parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
> + /* Cap by GUC variable */
> + parallel_workers = Min(parallel_workers, max_parallel_workers);
>
Oh, that was my poor choice of name for a local variable (I'll rename it).
This variable takes different values depending on the operation performed:
autovacuum_max_parallel_workers for parallel autovacuum and
max_parallel_maintenance_workers for maintenance VACUUM.
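To make the capping explicit, here is a minimal sketch of the selection logic. It is not the patch's function: sketch_compute_workers and its parameters are hypothetical, and the number of indexes eligible for parallel processing is assumed to be computed already and passed in.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Pick the parallel degree for one vacuum.
 *
 * nrequested         - degree requested by the caller (<= 0 means "not set")
 * nindexes_parallel  - indexes eligible for parallel processing
 * is_autovacuum      - true when called from an autovacuum worker
 * autovacuum_limit   - models autovacuum_max_parallel_workers
 * maintenance_limit  - models max_parallel_maintenance_workers
 */
static int
sketch_compute_workers(int nrequested, int nindexes_parallel,
                       bool is_autovacuum,
                       int autovacuum_limit, int maintenance_limit)
{
    /* The cap depends on who is running the vacuum. */
    int limit = is_autovacuum ? autovacuum_limit : maintenance_limit;
    int workers;

    assert(nindexes_parallel >= 0);

    /* A zero limit disables parallelism for this kind of vacuum. */
    if (limit == 0)
        return 0;

    /* Use the requested degree if given, bounded by the eligible indexes. */
    workers = (nrequested > 0 && nrequested < nindexes_parallel) ?
        nrequested : nindexes_parallel;

    /* Finally cap by the applicable limit. */
    return workers < limit ? workers : limit;
}
```

With this shape, a manual VACUUM is still capped by the maintenance limit; only the autovacuum path looks at the new parameter.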
>
> 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ?
>
> + "autovacuum_parallel_workers",
> + "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
>
I don't think that we should refer to max_parallel_workers here.
Actually, this reloption doesn't depend on max_parallel_workers at all.
I mentioned the bgworkers pool (both here and in the description of the
autovacuum_max_parallel_workers parameter) in order to clarify that
parallel autovacuum will use dynamic workers instead of launching
more a/v workers.
BTW, I don't really like that the description of this option turns out to be
very long. I'll leave only a short description in reloptions.c and move
the clarification about the zero value to rel.h.
Mentions of the bgworkers pool will remain only in the
description of autovacuum_max_parallel_workers.
> 4/ The comment "When parallel autovacuum worker die" suggests an abnormal
> exit. "Terminates" seems clearer, since this applies to both normal and
> abnormal exits.
>
> instead of:
> + * When parallel autovacuum worker die,
>
> how about this:
> * When parallel autovacuum worker terminates,
>
Sounds reasonable, I'll fix it.
>
> 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before
> DestroyParallelContext?
>
> + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember
> this value */
> DestroyParallelContext(pvs->pcxt);
> +
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + AutoVacuumReleaseParallelWorkers(nlaunched_workers);
>
I wrote about this above [1], but I'll repeat my thoughts here:
"""
Destroying parallel context includes waiting for all workers to exit (after
which, other operations can use them).
If we first call ParallelAutoVacuumReleaseWorkers, some operation can
reasonably request all released workers. But this request can fail,
because there is no guarantee that workers managed to finish.
Actually, there's nothing wrong with that, but I think releasing workers
only after finishing work is a more logical approach.
"""
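The ordering argument can be sketched with a toy counter. This is a single-process model, not the patch's code: the free_workers variable and both function names are hypothetical, locking (AutovacuumLock in the patch) is left out, and reservation here subtracts only the number of workers actually granted.

```c
#include <assert.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical stand-in for the shared count of free parallel workers. */
int free_workers = 0;

/* Reserve up to nworkers; returns how many were actually granted. */
int
reserve_parallel_workers(int nworkers)
{
    int granted;

    assert(nworkers >= 0);
    granted = Min(free_workers, nworkers);
    free_workers -= granted;
    return granted;
}

/* Return nworkers to the pool, never exceeding the configured maximum. */
void
release_parallel_workers(int nworkers, int max_workers)
{
    assert(nworkers >= 0);
    free_workers = Min(free_workers + nworkers, max_workers);
}
```

In this model a leader would call release_parallel_workers() only after its workers have exited, so the counter never advertises a worker that is still running. The max_workers clamp mirrors the case where the limit is lowered while a parallel autovacuum is in progress.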
>
> 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside
> AutoVacuumReleaseParallelWorkers()?
>
> if (!AmAutoVacuumWorkerProcess())
> return;
>
It seems to me that the opposite is true. If the AmAutoVacuumWorkerProcess()
check is not visible at the call site, the unconditional call might confuse
somebody. The doubts would disappear after reading the
AutoVacuumReleaseParallelWorkers code, but IMO the code in vacuumparallel.c
would become less intuitive.
> 7/ It looks like the psql tab completion for autovacuum_parallel_workers is
> missing:
>
> test=# alter table t set (autovacuum_
> autovacuum_analyze_scale_factor
> autovacuum_analyze_threshold
> autovacuum_enabled
> autovacuum_freeze_max_age
> autovacuum_freeze_min_age
> autovacuum_freeze_table_age
> autovacuum_multixact_freeze_max_age
> autovacuum_multixact_freeze_min_age
> autovacuum_multixact_freeze_table_age
> autovacuum_vacuum_cost_delay
> autovacuum_vacuum_cost_limit
> autovacuum_vacuum_insert_scale_factor
> autovacuum_vacuum_insert_threshold
> autovacuum_vacuum_max_threshold
> autovacuum_vacuum_scale_factor
> autovacuum_vacuum_threshold
>
Good catch, I'll fix it.
Thank you for the review! Please see the v9 patches:
1) Run pgindent + rebase patches on newest commit in master.
2) Introduce changes for documentation.
3) Rename local variable in parallel_vacuum_compute_workers.
4) Shorten the description of autovacuum_parallel_workers in
reloptions.c (move clarifications for it into rel.h).
5) Reword "When parallel autovacuum worker die" comment.
6) Add tab completion for autovacuum_parallel_workers table option.
[1] https://www.postgresql.org/message-id/CAJDiXgi7KB7wSQ%3DUx%3DngdaCvJnJ5x-ehvTyiuZez%2B5uKHtV6iQ%40ma...
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v9-0002-Logging-for-parallel-autovacuum.patch (7.4K, 2-v9-0002-Logging-for-parallel-autovacuum.patch)
From 1c4c65cad27e2986962ef0d041bf4f332c58f668 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 22 Jul 2025 02:47:24 +0700
Subject: [PATCH v9 2/3] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
3 files changed, 53 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..f1a645e79a9 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -348,6 +348,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -688,6 +694,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by autovacuum leader worker, because it must log it in
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1012,6 +1028,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2634,7 +2655,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3047,7 +3069,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ffc140dabcf..51511bf2100 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage * wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -510,7 +510,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -521,7 +521,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -529,7 +529,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -541,7 +542,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -626,7 +627,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage * wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -750,6 +751,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..d05ef7461ea 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
[text/x-patch] v9-0003-Documentation-for-parallel-autovacuum.patch (4.4K, 3-v9-0003-Documentation-for-parallel-autovacuum.patch)
From de46c8232641e288a46e6af1799961c1b00a4655 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 22 Jul 2025 12:31:20 +0700
Subject: [PATCH v9 3/3] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c7acc0f182f..06b0aff6cb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9187,6 +9188,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index e7a9f58c015..4e450ba9066 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -896,6 +896,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the enabled
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index dc000e913c1..288de6b0ffd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v9-0001-Parallel-index-autovacuum.patch (17.3K, 4-v9-0001-Parallel-index-autovacuum.patch)
From f5e21f44faa8618bfd575099317ecf06213f62f0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 20 Jul 2025 23:03:57 +0700
Subject: [PATCH v9 1/3] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 46 ++++++-
src/backend/postmaster/autovacuum.c | 121 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 1 +
src/include/utils/rel.h | 7 +
11 files changed, 196 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..cc3ffc43a05 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1881,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..ffc140dabcf 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +439,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +459,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -553,12 +565,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +663,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Also reserve workers in autovacuum global state. Note, that we may be
+ * given fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +714,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9474095f271..76eb04029a3 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +356,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -753,6 +756,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +774,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2861,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3347,68 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ can_launch = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= can_launch;
+
+ LWLockRelease(AutovacuumLock);
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum worker terminates, leader worker must call this
+ * function in order to refresh global autovacuum state. Thus, other leaders
+ * will be able to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /* Refresh autovacuum_max_parallel_workers parameter */
+ CHECK_FOR_INTERRUPTS();
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If autovacuum_max_parallel_workers parameter was reduced during
+ * parallel autovacuum execution, we must cap available workers number by
+ * its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ autovacuum_max_parallel_workers);
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3469,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3521,12 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+void
+assign_autovacuum_max_parallel_workers(int newval, void *extra)
+{
+ autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3558,32 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..4941ad976df 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ NULL, assign_autovacuum_max_parallel_workers, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 37524364290..3b3d4438e65 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1399,6 +1399,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..42d4a63d033 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 82ac8646a8d..04833b4f147 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,7 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern void assign_autovacuum_max_parallel_workers(int newval, void *extra);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..edd286808bf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will be computed based on number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-08-07 23:38 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-08-07 23:38 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Jul 21, 2025 at 11:45 PM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <[email protected]> wrote:
> >
> > I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and
> > have a few comments from my initial pass.
> >
> > 1/ Please run pgindent.
>
> OK, I'll do it.
>
> > 2/ Documentation is missing. There may be more, but here are the places I
> > found that likely need updates for the new behavior, reloptions, GUC, etc.
> > Including docs in the patch early would help clarify expected behavior.
> >
> > https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS
> > https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
> > https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
> > https://www.postgresql.org/docs/current/sql-createtable.html
> > https://www.postgresql.org/docs/current/sql-altertable.html
> > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
> > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WO...
> >
>
> Thanks for gathering it all together. I'll update the documentation so
> it will reflect changes in autovacuum daemon, reloptions and GUC
> parameters. So far, I don't see what we can add to vacuum-basics
> and alter-table paragraphs.
>
> I'll create separate .patch file for changes in documentation.
>
> > One thing I am unclear on is the interaction between max_worker_processes,
> > max_parallel_workers, and max_parallel_maintenance_workers. For example, does
> > the following change mean that manual VACUUM PARALLEL is no longer capped by
> > max_parallel_maintenance_workers?
> >
> > @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels,
> > int nindexes, int nrequested,
> > parallel_workers = (nrequested > 0) ?
> > Min(nrequested, nindexes_parallel) : nindexes_parallel;
> >
> > - /* Cap by max_parallel_maintenance_workers */
> > - parallel_workers = Min(parallel_workers,
> > max_parallel_maintenance_workers);
> > + /* Cap by GUC variable */
> > + parallel_workers = Min(parallel_workers, max_parallel_workers);
> >
>
> Oh, it is my poor choice of a name for a local variable (I'll rename it).
> This variable can get different values depending on performed operation :
> autovacuum_max_parallel_workers for parallel autovacuum and
> max_parallel_maintenance_workers for maintenance VACUUM.
>
> >
> > 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ?
> >
> > + "autovacuum_parallel_workers",
> > + "Maximum number of parallel autovacuum workers
> > that can be taken from bgworkers pool for processing this table. "
> >
>
> I don't think that we should refer to max_parallel_workers here.
> Actually, this reloption doesn't depend on max_parallel_workers at all.
> I wrote about bgworkers pool (both here and in description of
> autovacuum_max_parallel_workers parameter) in order to clarify that
> parallel autovacuum will use dynamic workers instead of launching
> more a/v workers.
>
> BTW, I don't really like that the comment on this option turns out to be
> very large. I'll leave only short description in reloptions.c and move
> clarification about zero value in rel.h
> Mentions of bgworkers pool will remain only in
> description of autovacuum_max_parallel_workers.
>
> > 4/ The comment "When parallel autovacuum worker die" suggests an abnormal
> > exit. "Terminates" seems clearer, since this applies to both normal and
> > abnormal exits.
> >
> > instead of:
> > + * When parallel autovacuum worker die,
> >
> > how about this:
> > * When parallel autovacuum worker terminates,
> >
>
> Sounds reasonable, I'll fix it.
>
> >
> > 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before
> > DestroyParallelContext?
> >
> > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember
> > this value */
> > DestroyParallelContext(pvs->pcxt);
> > +
> > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> > + if (AmAutoVacuumWorkerProcess())
> > + AutoVacuumReleaseParallelWorkers(nlaunched_workers);
> >
>
> I wrote about it above [1], but I think I can duplicate my thoughts here :
> """
> Destroying parallel context includes waiting for all workers to exit (after
> which, other operations can use them).
> If we first call ParallelAutoVacuumReleaseWorkers, some operation can
> reasonably request all released workers. But this request can fail,
> because there is no guarantee that workers managed to finish.
>
> Actually, there's nothing wrong with that, but I think releasing workers
> only after finishing work is a more logical approach.
> """
>
> >
> > 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside
> > AutoVacuumReleaseParallelWorkers()?
> >
> > if (!AmAutoVacuumWorkerProcess())
> > return;
> >
>
> It seems to me that the opposite is true. If there is no alternative to calling
> AmAutoVacuumWorkerProcess, it might confuse somebody. All doubts
> will disappear after viewing the AmAutoVacuumWorkerProcess code,
> but IMO code in vacuumparallel.c will become less intuitive.
>
> > 7/ It looks like the psql tab completion for autovacuum_parallel_workers is
> > missing:
> >
> > test=# alter table t set (autovacuum_
> > autovacuum_analyze_scale_factor
> > autovacuum_analyze_threshold
> > autovacuum_enabled
> > autovacuum_freeze_max_age
> > autovacuum_freeze_min_age
> > autovacuum_freeze_table_age
> > autovacuum_multixact_freeze_max_age
> > autovacuum_multixact_freeze_min_age
> > autovacuum_multixact_freeze_table_age
> > autovacuum_vacuum_cost_delay
> > autovacuum_vacuum_cost_limit
> > autovacuum_vacuum_insert_scale_factor
> > autovacuum_vacuum_insert_threshold
> > autovacuum_vacuum_max_threshold
> > autovacuum_vacuum_scale_factor
> > autovacuum_vacuum_threshold
> >
>
> Good catch, I'll fix it.
>
> Thank you for the review! Please, see v9 patches :
> 1) Run pgindent + rebase patches on newest commit in master.
> 2) Introduce changes for documentation.
> 3) Rename local variable in parallel_vacuum_compute_workers.
> 4) Shorten the description of autovacuum_parallel_workers in
> reloptions.c (move clarifications for it into rel.h).
> 5) Reword "When parallel autovacuum worker die" comment.
> 6) Add tab completion for autovacuum_parallel_workers table option.
Thank you for updating the patch. Here are some review comments.
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
We release the reserved workers in parallel_vacuum_end(). However,
parallel_vacuum_end() is called only once, at the end of vacuum. I
think we need to release the reserved workers after each index vacuuming
or cleanup pass; otherwise we would end up holding them until
the end of vacuum even if we invoke index vacuuming multiple times.
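To make the point above concrete, here is a minimal, self-contained sketch of reserving and releasing per index pass rather than once per vacuum. It is not code from the patch: pass_free_workers, pass_reserve(), pass_release() and sketch_index_pass() are hypothetical names, and locking and the actual worker launch are left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared pool of free parallel workers. */
static int pass_free_workers = 4;

/* Reserve up to nrequested workers; returns how many were reserved. */
static int
pass_reserve(int nrequested)
{
    int nreserved = (nrequested < pass_free_workers) ?
        nrequested : pass_free_workers;

    pass_free_workers -= nreserved;
    return nreserved;
}

/* Hand previously reserved workers back to the pool. */
static void
pass_release(int nreserved)
{
    assert(nreserved >= 0);
    pass_free_workers += nreserved;
}

/*
 * One index vacuuming (or cleanup) pass: reserve, do the work, release.
 * Nothing stays reserved between passes, so other autovacuum workers can
 * use the pool while this one is scanning the heap again.
 */
static int
sketch_index_pass(int nrequested)
{
    int nreserved = pass_reserve(nrequested);

    /* ... launch nreserved workers and wait for them to finish ... */

    pass_release(nreserved);
    return nreserved;
}
```

Calling sketch_index_pass() several times in a row leaves the pool at its initial size after every call, which is the property being asked for here.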
---
+void
+assign_autovacuum_max_parallel_workers(int newval, void *extra)
+{
+ autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
+}
I don't think we need the assign hook for this GUC parameter. We can
internally cap the maximum value by max_worker_processes like other
GUC parameters such as max_parallel_maintenance_workers and
max_parallel_workers.
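As a rough illustration of capping internally instead of using an assign hook, the following sketch computes the parallel degree at the point of use. It is a simplification under assumptions: sketch_compute_workers() and its parameters are hypothetical, and the real parallel_vacuum_compute_workers() does more (for example, deciding which indexes are eligible).

```c
#include <assert.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: compute the parallel degree for one table.
 * nrequested == 0 means "derive the degree from the number of indexes".
 */
static int
sketch_compute_workers(int nrequested, int nindexes_parallel,
                       int av_max_parallel_workers,
                       int max_worker_processes)
{
    int parallel_workers;

    assert(nrequested >= 0 && nindexes_parallel >= 0);

    parallel_workers = (nrequested > 0) ?
        Min(nrequested, nindexes_parallel) : nindexes_parallel;

    /*
     * Cap by the GUC and then by the hard process limit at the point of
     * use. The GUC itself keeps whatever value the user configured, so no
     * assign hook is needed.
     */
    parallel_workers = Min(parallel_workers, av_max_parallel_workers);
    parallel_workers = Min(parallel_workers, max_worker_processes);

    return parallel_workers;
}
```

With this shape, a configured value larger than max_worker_processes is accepted as is and simply has no effect beyond the limit.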
---
+ /* Refresh autovacuum_max_parallel_workers parameter */
+ CHECK_FOR_INTERRUPTS();
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If autovacuum_max_parallel_workers parameter was reduced during
+ * parallel autovacuum execution, we must cap available workers number by
+ * its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ autovacuum_max_parallel_workers);
+
+ LWLockRelease(AutovacuumLock);
I think another race condition could occur; suppose
autovacuum_max_parallel_workers is set to '5' and one autovacuum
worker reserved 5 workers, meaning that
AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
autovacuum_max_parallel_workers to 3 and reloads the conf file right
after the autovacuum worker checks for interrupts. The launcher
process calls adjust_free_parallel_workers() but
av_freeParallelWorkers remains 0, and the autovacuum worker then increments
it by 5, as its local autovacuum_max_parallel_workers value is still 5.
I think that we can keep the autovacuum_max_parallel_workers value in
shmem, and let only the launcher process modify it when the GUC
is changed. Autovacuum workers simply increase or decrease
av_freeParallelWorkers within the range of 0 to the
autovacuum_max_parallel_workers value in shmem. When changing the
autovacuum_max_parallel_workers and av_freeParallelWorkers values in
shmem, the launcher process calculates the number of workers reserved
at that time and calculates the new av_freeParallelWorkers value by
subtracting the number of reserved workers from the new
autovacuum_max_parallel_workers.
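The proposal can be sketched as a small self-contained model. This is only an illustration under assumptions: the AvParallelSketch struct and the shm_* functions are hypothetical stand-ins for the AutoVacuumShmem fields, locking (AutovacuumLock in the real code) is omitted, and clamping a negative free count to 0 is this sketch's own choice.

```c
#include <assert.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))
#define Max(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for the relevant shared-memory fields. */
typedef struct AvParallelSketch
{
    int max_parallel_workers;   /* shmem copy of the GUC; launcher writes */
    int free_parallel_workers;  /* kept within [0, max_parallel_workers] */
} AvParallelSketch;

/* Launcher: apply a new GUC value while accounting for reservations. */
static void
shm_launcher_set_max(AvParallelSketch *shm, int new_max)
{
    int nreserved = shm->max_parallel_workers - shm->free_parallel_workers;

    shm->max_parallel_workers = new_max;
    /* Assumption of this sketch: never let the free count go negative. */
    shm->free_parallel_workers = Max(new_max - nreserved, 0);
}

/* Worker: take up to nworkers from the pool; returns the number reserved. */
static int
shm_reserve(AvParallelSketch *shm, int nworkers)
{
    int nreserved = Min(nworkers, shm->free_parallel_workers);

    shm->free_parallel_workers -= nreserved;
    return nreserved;
}

/* Worker: give workers back, capped by the shmem maximum, not a local GUC. */
static void
shm_release(AvParallelSketch *shm, int nworkers)
{
    shm->free_parallel_workers =
        Min(shm->free_parallel_workers + nworkers,
            shm->max_parallel_workers);
}
```

Replaying the scenario above (maximum 5, all 5 reserved, maximum lowered to 3, then 5 released) ends with 3 free workers, because the release is capped by the value the launcher stored rather than by the worker's stale local copy.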
---
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
How about renaming it to 'nreserved' or something? can_launch looks
like it's a boolean variable to indicate whether the process can
launch workers.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-08-14 20:40 Masahiko Sawada <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-08-14 20:40 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, Aug 7, 2025 at 4:38 PM Masahiko Sawada <[email protected]> wrote:
>
> Thank you for updating the patch. Here are some review comments.
>
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + AutoVacuumReleaseParallelWorkers(nlaunched_workers);
> +
>
> We release the reserved worker in parallel_vacuum_end(). However,
> parallel_vacuum_end() is called only once at the end of vacuum. I
> think we need to release the reserved worker after index vacuuming or
> cleanup, otherwise we would end up holding the reserved workers until
> the end of vacuum even if we invoke index vacuuming multiple times.
>
> ---
> +void
> +assign_autovacuum_max_parallel_workers(int newval, void *extra)
> +{
> + autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
> +}
>
> I don't think we need the assign hook for this GUC parameter. We can
> internally cap the maximum value by max_worker_processes like other
> GUC parameters such as max_parallel_maintenance_workers and
> max_parallel_workers.
>
> ---
> + /* Refresh autovacuum_max_parallel_workers parameter */
> + CHECK_FOR_INTERRUPTS();
> + if (ConfigReloadPending)
> + {
> + ConfigReloadPending = false;
> + ProcessConfigFile(PGC_SIGHUP);
> + }
> +
> + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> +
> + /*
> + * If autovacuum_max_parallel_workers parameter was reduced during
> + * parallel autovacuum execution, we must cap available workers number by
> + * its new value.
> + */
> + AutoVacuumShmem->av_freeParallelWorkers =
> + Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
> + autovacuum_max_parallel_workers);
> +
> + LWLockRelease(AutovacuumLock);
>
> I think another race condition could occur; suppose
> autovacuum_max_parallel_workers is set to '5' and one autovacuum
> worker reserved 5 workers, meaning that
> AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
> autovacuum_max_parallel_workers to 3 and reloads the conf file right
> after the autovacuum worker checks the interruption. The launcher
> processes calls adjust_free_parallel_workers() but
> av_freeParallelWorkers remains 0, and the autovacuum worker increments
> it by 5 as its autovacuum_max_parallel_workers value is still 5.
>
> I think that we can have the autovacuum_max_parallel_workers value on
> shmem, and only the launcher process can modify its value if the GUC
> is changed. Autovacuum workers simply increase or decrease the
> av_freeParallelWorkers within the range of 0 and the
> autovacuum_max_parallel_workers value on shmem. When changing
> autovacuum_max_parallel_workers and av_freeParallelWorkers values on
> shmem, the launcher process calculates the number of workers reserved
> at that time and calculate the new av_freeParallelWorkers value by
> subtracting the new autovacuum_max_parallel_workers by the number of
> reserved workers.
>
> ---
> +AutoVacuumReserveParallelWorkers(int nworkers)
> +{
> + int can_launch;
>
> How about renaming it to 'nreserved' or something? can_launch looks
> like it's a boolean variable to indicate whether the process can
> launch workers.
While testing the patch, I found two other problems:
1. When an autovacuum worker that has reserved workers fails with an error,
the reserved workers are not released. I think we need to ensure that
all reserved workers are released at the end of vacuum, even
on error.
2. When an autovacuum worker (not a parallel vacuum worker) that uses
parallel vacuum gets SIGHUP, it errors out with the error message
"parameter "max_stack_depth" cannot be set during a parallel
operation". Autovacuum checks for a configuration file reload in
vacuum_delay_point(), and while reloading the configuration file, it
attempts to set max_stack_depth in
InitializeGUCOptionsFromEnvironment() (which is called by
ProcessConfigFileInternal()). However, it cannot change
max_stack_depth since the worker is in parallel mode and
max_stack_depth doesn't have the GUC_ALLOW_IN_PARALLEL flag. This doesn't
happen in regular backends that use parallel queries because they
check for a configuration file reload at the end of each SQL command.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-08-18 08:30 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-08-18 08:30 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
Thank you very much for your comments!
In this email I'll reply to both of your recent messages.
On Fri, Aug 8, 2025 at 6:38 AM Masahiko Sawada <[email protected]> wrote:
>
> Thank you for updating the patch. Here are some review comments.
>
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + AutoVacuumReleaseParallelWorkers(nlaunched_workers);
> +
>
> We release the reserved worker in parallel_vacuum_end(). However,
> parallel_vacuum_end() is called only once at the end of vacuum. I
> think we need to release the reserved worker after index vacuuming or
> cleanup, otherwise we would end up holding the reserved workers until
> the end of vacuum even if we invoke index vacuuming multiple times.
>
Yep, you are right. It was easy to miss because autovacuum typically
takes only one cycle to process a table. Since both index vacuum and
index cleanup use the parallel_vacuum_process_all_indexes function,
I think that both reserving and releasing should be placed there.
> ---
> +void
> +assign_autovacuum_max_parallel_workers(int newval, void *extra)
> +{
> + autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
> +}
>
> I don't think we need the assign hook for this GUC parameter. We can
> internally cap the maximum value by max_worker_processes like other
> GUC parameters such as max_parallel_maintenance_workers and
> max_parallel_workers.
OK, I get it - we don't want to raise a configuration error for no serious
reason. Actually, we are already internally capping
autovacuum_max_parallel_workers by max_worker_processes (inside the
parallel_vacuum_compute_workers function). This is the same behavior
as max_parallel_maintenance_workers has.
I'll get rid of the assign hook and add one more cap inside autovacuum
shmem initialization: since max_worker_processes is a PGC_POSTMASTER
parameter, av_freeParallelWorkers must not exceed its value.
>
> ---
> + /* Refresh autovacuum_max_parallel_workers parameter */
> + CHECK_FOR_INTERRUPTS();
> + if (ConfigReloadPending)
> + {
> + ConfigReloadPending = false;
> + ProcessConfigFile(PGC_SIGHUP);
> + }
> +
> + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> +
> + /*
> + * If autovacuum_max_parallel_workers parameter was reduced during
> + * parallel autovacuum execution, we must cap available workers number by
> + * its new value.
> + */
> + AutoVacuumShmem->av_freeParallelWorkers =
> + Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
> + autovacuum_max_parallel_workers);
> +
> + LWLockRelease(AutovacuumLock);
>
> I think another race condition could occur; suppose
> autovacuum_max_parallel_workers is set to '5' and one autovacuum
> worker reserved 5 workers, meaning that
> AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
> autovacuum_max_parallel_workers to 3 and reloads the conf file right
> after the autovacuum worker checks the interruption. The launcher
> processes calls adjust_free_parallel_workers() but
> av_freeParallelWorkers remains 0, and the autovacuum worker increments
> it by 5 as its autovacuum_max_parallel_workers value is still 5.
>
I think this problem could be solved by acquiring AutovacuumLock
before processing the config file, but I understand that this is a bad approach.
> I think that we can have the autovacuum_max_parallel_workers value on
> shmem, and only the launcher process can modify its value if the GUC
> is changed. Autovacuum workers simply increase or decrease the
> av_freeParallelWorkers within the range of 0 and the
> autovacuum_max_parallel_workers value on shmem. When changing
> autovacuum_max_parallel_workers and av_freeParallelWorkers values on
> shmem, the launcher process calculates the number of workers reserved
> at that time and calculate the new av_freeParallelWorkers value by
> subtracting the new autovacuum_max_parallel_workers by the number of
> reserved workers.
>
Good idea, I agree. Replacing the GUC parameter with the variable in shmem
leaves the current logic of free workers management unchanged. Essentially,
this is the same solution as I described above, but we hold the lock only
during a simple value check rather than during config reloading. That makes
much more sense.
> ---
> +AutoVacuumReserveParallelWorkers(int nworkers)
> +{
> + int can_launch;
>
> How about renaming it to 'nreserved' or something? can_launch looks
> like it's a boolean variable to indicate whether the process can
> launch workers.
>
No objections.
On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <[email protected]> wrote:
>
> While testing the patch, I found there are other two problems:
>
> 1. when an autovacuum worker who reserved workers fails with an error,
> the reserved workers are not released. I think we need to ensure that
> all reserved workers are surely released at the end of vacuum even
> with an error.
>
Agreed. I'll add a try/catch block to parallel_vacuum_process_all_indexes
(the only place where we reserve workers).
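For illustration only, the "release on error" pattern can be modeled outside PostgreSQL with setjmp()/longjmp(), which is the mechanism PG_TRY()/PG_CATCH() build on (via sigsetjmp()). Everything named here (err_free_workers, err_ctx, sketch_process_internal(), sketch_process_all_indexes()) is hypothetical and is not the patch's code; in the real function the catch block would re-throw instead of returning a status.

```c
#include <assert.h>
#include <setjmp.h>

static int err_free_workers = 4;    /* hypothetical shared pool */
static jmp_buf err_ctx;             /* stand-in for the error stack */

/* Simulated index processing that may "throw" an error. */
static void
sketch_process_internal(int nreserved, int fail)
{
    (void) nreserved;
    if (fail)
        longjmp(err_ctx, 1);
}

/* Returns 0 on success, 1 on error; the pool is restored either way. */
static int
sketch_process_all_indexes(int nrequested, int fail)
{
    /* volatile: modified after setjmp() and read after longjmp() */
    volatile int nreserved = 0;

    if (setjmp(err_ctx) == 0)
    {
        /* "try" part: reserve workers, then do the work */
        nreserved = (nrequested < err_free_workers) ?
            nrequested : err_free_workers;
        err_free_workers -= nreserved;
        sketch_process_internal(nreserved, fail);
    }
    else
    {
        /* "catch" part: release what was reserved before reporting */
        err_free_workers += nreserved;
        return 1;
    }

    /* normal path: release after the pass completes */
    err_free_workers += nreserved;
    return 0;
}
```

Both the successful and the failing call leave err_free_workers at its initial value, which is the invariant the try/catch block is meant to guarantee.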
> 2. when an autovacuum worker (not parallel vacuum worker) who uses
> parallel vacuum gets SIGHUP, it errors out with the error message
> "parameter "max_stack_depth" cannot be set during a parallel
> operation". Autovacuum checks the configuration file reload in
> vacuum_delay_point(), and while reloading the configuration file, it
> attempts to set max_stack_depth in
> InitializeGUCOptionsFromEnvironment() (which is called by
> ProcessConfigFileInternal()). However, it cannot change
> max_stack_depth since the worker is in parallel mode but
> max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
> happen in regular backends who are using parallel queries because they
> check the configuration file reload at the end of each SQL command.
>
Hm, this is a really serious problem. I see only two ways to solve it (neither
is really good):
1) Do not allow processing of the config file during parallel autovacuum
execution.
2) Teach autovacuum to enter parallel mode only during the index vacuum/cleanup
phase. I'm a bit wary about it, because the design says that we should
be in parallel mode during the whole parallel operation. But actually, if we
can make sure that all launched workers have exited, I don't see a reason why
we can't just exit parallel mode at the end of
parallel_vacuum_process_all_indexes.
What do you think about it? So far, I haven't made any changes related
to this problem.
Again, thank you for the review. Please see the v10 patches (only 0001
has been changed):
1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
2) Add a try/catch block to parallel_vacuum_process_all_indexes, so we can
release workers even after an error. This required adding a static
variable to account for the total number of reserved workers
(av_nworkers_reserved).
3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
autovacuum code. The assign hook has been removed.
4) Use the shmem value for determining the maximum number of parallel autovacuum
workers (eliminates the race condition between the launcher and leader process).
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v10-0002-Logging-for-parallel-autovacuum.patch (8.2K, 2-v10-0002-Logging-for-parallel-autovacuum.patch)
From e991e071d4798e8c2ec576389f5a8592fe76282b Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 18 Aug 2025 15:14:25 +0700
Subject: [PATCH v10 2/3] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 ++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 28 ++++++++++++++++++---------
src/include/commands/vacuum.h | 16 +++++++++++++--
3 files changed, 58 insertions(+), 13 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..f1a645e79a9 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -348,6 +348,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -688,6 +694,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by autovacuum leader worker, because it must log it in
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1012,6 +1028,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2634,7 +2655,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3047,7 +3069,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 4221e6084f5..02870ed1288 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,9 +227,10 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage * wusage);
static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum);
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage * wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -504,7 +505,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -515,7 +516,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -523,7 +524,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -535,7 +537,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -620,7 +622,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage * wusage)
{
/*
* Parallel autovacuum can reserve parallel workers. Use a try/catch block
@@ -629,7 +631,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
PG_TRY();
{
parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
- vacuum);
+ vacuum, wusage);
}
PG_CATCH();
{
@@ -644,7 +646,8 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
static void
parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum)
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage * wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -768,6 +771,13 @@ parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..d05ef7461ea 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total numbers of launched and planned workers
+ * during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
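As a reading aid for the 0002 patch above, here is a minimal standalone sketch of how the per-pass counters accumulate into PVWorkersUsage. The helper name record_index_pass is hypothetical and not part of the patch; the struct mirrors the one added to vacuum.h, and the NULL check mirrors the "if (wusage != NULL)" guard.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the PVWorkersUsage struct added by the patch. */
typedef struct PVWorkersUsage
{
    int nlaunched;
    int nplanned;
} PVWorkersUsage;

/*
 * Hypothetical stand-in for one parallel index pass: 'nplanned' workers were
 * requested, but only 'nlaunched' of them actually started.
 */
void
record_index_pass(PVWorkersUsage *wusage, int nplanned, int nlaunched)
{
    assert(nlaunched >= 0 && nlaunched <= nplanned);

    /* Callers that do not want statistics pass NULL. */
    if (wusage == NULL)
        return;

    wusage->nlaunched += nlaunched;
    wusage->nplanned += nplanned;
}
```

Because the counters are only ever added to, the totals reported in the log line cover every bulk-delete and cleanup pass of one vacuum, not just the last one.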
[text/x-patch] v10-0003-Documentation-for-parallel-autovacuum.patch (4.4K, 3-v10-0003-Documentation-for-parallel-autovacuum.patch)
From 62abb120d888a837e50bb55ba26ba740caad8f7a Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 22 Jul 2025 12:31:20 +0700
Subject: [PATCH v10 3/3] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 20ccb2d6b54..b74053281de 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9189,6 +9190,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is
+ capped by <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which disables parallel index vacuuming by autovacuum.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index e7a9f58c015..4e450ba9066 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -896,6 +896,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index dc000e913c1..288de6b0ffd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which disables parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
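The interaction between the autovacuum_parallel_workers storage parameter and the autovacuum_max_parallel_workers GUC described in the documentation patch above can be summarized with a small sketch. This is a simplified model under stated assumptions: the function name autovacuum_parallel_degree is hypothetical, and the real parallel_vacuum_compute_workers() also decides which indexes qualify for parallel processing, which is reduced here to the nindexes_parallel argument.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many parallel workers to plan for one table.
 *
 * reloption          value of autovacuum_parallel_workers (-1 = disabled,
 *                    0 = derive from the number of indexes)
 * nindexes_parallel  number of indexes eligible for parallel processing
 * guc_max            value of autovacuum_max_parallel_workers
 */
int
autovacuum_parallel_degree(int reloption, int nindexes_parallel, int guc_max)
{
    int nworkers;

    assert(reloption >= -1 && nindexes_parallel >= 0 && guc_max >= 0);

    /* Disabled per table, disabled globally, or nothing to parallelize. */
    if (reloption < 0 || guc_max == 0 || nindexes_parallel == 0)
        return 0;

    /* 0 means "one worker per eligible index"; otherwise honor the request. */
    nworkers = (reloption > 0 && reloption < nindexes_parallel)
        ? reloption
        : nindexes_parallel;

    /* Cap by the GUC. */
    return nworkers < guc_max ? nworkers : guc_max;
}
```

Note that the result is only a plan: as the documentation says, fewer workers may actually be available at run time.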
[text/x-patch] v10-0001-Parallel-index-autovacuum.patch (18.7K, 4-v10-0001-Parallel-index-autovacuum.patch)
From a470d95603b437ef5aa45470ad7be61f03682493 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 20 Jul 2025 23:03:57 +0700
Subject: [PATCH v10 1/3] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 68 ++++++++-
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
10 files changed, 241 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0af3fea68fa..1c98d43c6eb 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1881,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..4221e6084f5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" refers to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -225,6 +228,8 @@ static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum);
+static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -373,8 +378,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +559,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +608,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -610,6 +621,30 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum)
+{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Use a try/catch block
+ * to ensure that all workers are released.
+ */
+ PG_TRY();
+ {
+ parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
+ vacuum);
+ }
+ PG_CATCH();
+ {
+ /* Release all reserved parallel workers, if any. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseAllParallelWorkers();
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+}
+
+static void
+parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum)
{
int nworkers;
PVIndVacStatus new_status;
@@ -646,6 +681,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +732,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +790,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ff96b36d710..78ceac67319 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autovacuum_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2871,8 +2893,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3353,6 +3379,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that actually can be launched and reserves these workers
+ * (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3413,6 +3518,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3494,3 +3603,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..9ecb14227e5 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b10f2313f3..290dd5cb8ec 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1402,6 +1402,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..904c5ce37d8 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..edd286808bf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
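The reserve/release accounting in the 0001 patch above can be modeled in isolation as follows. This is only a sketch under stated assumptions: the ParallelWorkerPool type and the pool_* functions are hypothetical, everything lives in one process, and the AutovacuumLock that protects the real shared counters is omitted. The sketch subtracts only the number of workers actually granted, so the free counter can never go below zero.

```c
#include <assert.h>

/* Hypothetical single-process model of the counters used by the patch. */
typedef struct ParallelWorkerPool
{
    int free_workers;   /* models av_freeParallelWorkers */
    int max_workers;    /* models av_maxParallelWorkers */
    int reserved;       /* models av_nworkers_reserved of one leader */
} ParallelWorkerPool;

/* Reserve up to 'nrequested' workers; returns how many were granted. */
int
pool_reserve(ParallelWorkerPool *pool, int nrequested)
{
    int nreserved;

    /* A leader reserves once per index pass, starting from nothing. */
    assert(nrequested >= 0 && pool->reserved == 0);

    nreserved = nrequested < pool->free_workers
        ? nrequested
        : pool->free_workers;
    pool->free_workers -= nreserved;
    pool->reserved += nreserved;
    return nreserved;
}

/* Give back 'nworkers' previously reserved workers. */
void
pool_release(ParallelWorkerPool *pool, int nworkers)
{
    int nfree;

    assert(nworkers >= 0 && nworkers <= pool->reserved);

    /* The maximum may have been lowered meanwhile, so cap the result. */
    nfree = pool->free_workers + nworkers;
    pool->free_workers = nfree < pool->max_workers ? nfree : pool->max_workers;
    pool->reserved -= nworkers;
}
```

A leader would call pool_reserve() before launching workers, release the difference if fewer workers started than were reserved, and release the rest once the index pass finishes or an error is caught.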
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-08-18 21:03 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-08-18 21:03 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Aug 18, 2025 at 1:31 AM Daniil Davydov <[email protected]> wrote:
>
>
> On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <[email protected]> wrote:
> >
>
> > 2. when an autovacuum worker (not parallel vacuum worker) who uses
> > parallel vacuum gets SIGHUP, it errors out with the error message
> > "parameter "max_stack_depth" cannot be set during a parallel
> > operation". Autovacuum checks the configuration file reload in
> > vacuum_delay_point(), and while reloading the configuration file, it
> > attempts to set max_stack_depth in
> > InitializeGUCOptionsFromEnvironment() (which is called by
> > ProcessConfigFileInternal()). However, it cannot change
> > max_stack_depth since the worker is in parallel mode but
> > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
> > happen in regular backends who are using parallel queries because they
> > check the configuration file reload at the end of each SQL command.
> >
>
> Hm, this is a really serious problem. I see only two ways to solve it (both are
> not really good) :
> 1)
> Do not allow processing of the config file during parallel autovacuum
> execution.
>
> 2)
> Teach the autovacuum to enter parallel mode only during the index vacuum/cleanup
> phase. I'm a bit wary about it, because the design says that we should be in
> parallel mode during the whole parallel operation. But actually, if we can make
> sure that all launched workers are exited, I don't see reasons, why can't we
> just exit parallel mode at the end of parallel_vacuum_process_all_indexes.
>
> What do you think about it?
Hmm, given that we're trying to support parallel heap vacuum on
another thread[1] and will probably support it in autovacuum as well,
it seems to me that these approaches won't work.
Another idea would be to allow autovacuum workers to process the
config file even in parallel mode. GUC changes in the leader worker
would not affect parallel vacuum workers, but that is fine with me. In
the context of autovacuum, only the GUC parameters related to
cost-based delays need to be propagated to parallel vacuum workers.
We would probably need some changes to compute_parallel_delay() so that
parallel workers can compute the sleep time based on the new
vacuum_cost_limit and vacuum_cost_delay after the leader process
(i.e., the autovacuum worker) reloads the config file.
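To illustrate the idea, here is a rough standalone sketch of a worker deriving its sleep time from cost parameters that the leader republishes after a reload. Everything in it is an assumption for illustration: SharedCostParams and worker_sleep_ms are hypothetical names, the formula is a simplified model of cost-based delay (sleep proportional to the accumulated balance, capped at four times the base delay), and the synchronization that real shared memory would need is left out.

```c
#include <assert.h>

/* Hypothetical shared area that the leader rewrites after a config reload. */
typedef struct SharedCostParams
{
    double cost_delay_ms;   /* models vacuum_cost_delay */
    int    cost_limit;      /* models vacuum_cost_limit */
} SharedCostParams;

/*
 * Sleep time for a worker that has accumulated 'cost_balance' units of work.
 * Returns 0 when delays are disabled or the limit has not been reached yet.
 */
double
worker_sleep_ms(const SharedCostParams *shared, int cost_balance)
{
    double msec;

    assert(cost_balance >= 0);

    if (shared->cost_delay_ms <= 0 || shared->cost_limit <= 0)
        return 0.0;
    if (cost_balance < shared->cost_limit)
        return 0.0;

    msec = shared->cost_delay_ms * cost_balance / shared->cost_limit;

    /* Never sleep longer than four times the base delay. */
    if (msec > shared->cost_delay_ms * 4)
        msec = shared->cost_delay_ms * 4;

    return msec;
}
```

Since the worker reads the parameters on every call instead of caching them at startup, a change made by the leader takes effect at the worker's next delay point.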
>
> Again, thank you for the review. Please, see v10 patches (only 0001
> has been changed) :
> 1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
> 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so we can
> release workers even after an error. This required adding a static variable
> to account for the total number of reserved workers (av_nworkers_reserved).
> 3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
> autovacuum code. Assign hook has been removed.
> 4) Use shmem value for determining the maximum number of parallel autovacuum
> workers (eliminate race condition between launcher and leader process).
Thank you for updating the patch! I'll review the new version patches.
Regards,
[1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.g...
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-09-15 18:50 Alexander Korotkov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Alexander Korotkov @ 2025-09-15 18:50 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi!
On Tue, Aug 19, 2025 at 12:04 AM Masahiko Sawada <[email protected]> wrote:
>
> On Mon, Aug 18, 2025 at 1:31 AM Daniil Davydov <[email protected]> wrote:
> >
> >
> > On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <[email protected]> wrote:
> > >
> >
> > > 2. when an autovacuum worker (not parallel vacuum worker) who uses
> > > parallel vacuum gets SIGHUP, it errors out with the error message
> > > "parameter "max_stack_depth" cannot be set during a parallel
> > > operation". Autovacuum checks the configuration file reload in
> > > vacuum_delay_point(), and while reloading the configuration file, it
> > > attempts to set max_stack_depth in
> > > InitializeGUCOptionsFromEnvironment() (which is called by
> > > ProcessConfigFileInternal()). However, it cannot change
> > > max_stack_depth since the worker is in parallel mode but
> > > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
> > > happen in regular backends who are using parallel queries because they
> > > check the configuration file reload at the end of each SQL command.
> > >
> >
> > Hm, this is a really serious problem. I see only two ways to solve it
> > (both are not really good) :
> > 1)
> > Do not allow processing of the config file during parallel autovacuum
> > execution.
> >
> > 2)
> > Teach the autovacuum to enter parallel mode only during the index
> > vacuum/cleanup phase. I'm a bit wary about it, because the design says
> > that we should be in parallel mode during the whole parallel operation.
> > But actually, if we can make sure that all launched workers are exited,
> > I don't see reasons, why can't we just exit parallel mode at the end of
> > parallel_vacuum_process_all_indexes.
> >
> > What do you think about it?
>
> Hmm, given that we're trying to support parallel heap vacuum on
> another thread[1] and we will probably support it in autovacuums, it
> seems to me that these approaches won't work.
>
> Another idea would be to allow autovacuum workers to process the
> config file even in parallel mode. GUC changes in the leader worker
> would not affect parallel vacuum workers, but it is fine to me. In the
> context of autovacuum, only specific GUC parameters related to
> cost-based delays need to be affected also to parallel vacuum workers.
> Probably we need some changes to compute_parallel_delay() so that
> parallel workers can compute the sleep time based on the new
> vacuum_cost_limit and vacuum_cost_delay after the leader process
> (i.e., autovacuum worker) reloads the config file.
>
> >
> > Again, thank you for the review. Please, see v10 patches (only 0001
> > has been changed) :
> > 1) Reserve and release workers only inside
> > parallel_vacuum_process_all_indexes.
> > 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so
> > we can release workers even after an error. This required adding a
> > static variable to account for the total number of reserved workers
> > (av_nworkers_reserved).
> > 3) Cap autovacuum_max_parallel_workers by max_worker_processes only
> > inside autovacuum code. Assign hook has been removed.
> > 4) Use shmem value for determining the maximum number of parallel
> > autovacuum workers (eliminate race condition between launcher and
> > leader process).
>
> Thank you for updating the patch! I'll review the new version patches.
I've rebased this patchset onto the current master. That required me to move
the new GUC definition to guc_parameters.dat. Also, I adjusted
typedefs.list and ran pgindent. Some notes about the patch:
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
Should we use MAX_PARALLEL_WORKER_LIMIT instead of hard-coded 1024 here?
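For illustration only, the question above can be sketched in a standalone form. The struct below is a hypothetical, simplified stand-in for an integer reloption entry (it is not the real relopt_int), and MAX_PARALLEL_WORKER_LIMIT is redefined locally with an assumed value of 1024 so that the fragment compiles on its own; in the PostgreSQL tree the macro is assumed to come from postmaster/bgworker_internals.h.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: PostgreSQL defines MAX_PARALLEL_WORKER_LIMIT in
 * postmaster/bgworker_internals.h. It is redefined here, with an assumed
 * value of 1024, only so that this sketch compiles on its own.
 */
#define MAX_PARALLEL_WORKER_LIMIT 1024

/* Hypothetical, simplified stand-in for an integer reloption entry. */
typedef struct IntOptionSketch
{
	const char *name;
	int			default_val;
	int			min;
	int			max;
} IntOptionSketch;

const IntOptionSketch autovacuum_parallel_workers_opt = {
	"autovacuum_parallel_workers",
	-1,							/* default: no parallel index vacuuming */
	-1,							/* minimum */
	MAX_PARALLEL_WORKER_LIMIT	/* maximum, named instead of a bare 1024 */
};

/* Returns 1 if 'value' lies within the option's [min, max] range. */
int
option_value_in_range(const IntOptionSketch *opt, int value)
{
	assert(opt != NULL);
	return value >= opt->min && value <= opt->max;
}
```

With the named constant, the upper bound of the reloption would follow the macro if its value ever changed, instead of silently diverging from it.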
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
I'm not sure about the use of the word "future" here; it is not clear what
it means. Could we use "below" or "within this file" instead?
I see that parallel_vacuum_process_all_indexes() has a TRY/CATCH block. As I
have heard, the overhead of setting up and performing jumps is
platform-dependent, and not harmless on some platforms. Therefore, can we
skip the TRY/CATCH block for non-autovacuum vacuum? Possibly we could move it
to AutoVacWorkerMain(), which would also save us from repeatedly setting up a
jump in autovacuum workers.
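The first variant (arming the handler only when the caller is an autovacuum worker) can be sketched as follows. This is a minimal standalone model, not the patch's code: setjmp/longjmp stands in for PG_TRY/PG_CATCH, and every name (error_handler, raise_error, process_indexes, reserved_workers, handlers_armed) is hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not PostgreSQL APIs. */
static jmp_buf *error_handler = NULL;	/* innermost armed handler, if any */
int			reserved_workers = 0;	/* workers held by this leader */
int			handlers_armed = 0;		/* how often setjmp() was executed */

/* Simulated error report: jump to the armed handler, or give up. */
void
raise_error(void)
{
	if (error_handler == NULL)
		abort();
	longjmp(*error_handler, 1);
}

void
process_indexes_internal(bool is_autovacuum, bool fail)
{
	assert(reserved_workers == 0);
	if (is_autovacuum)
		reserved_workers = 2;	/* pretend two workers were reserved */
	if (fail)
		raise_error();
	reserved_workers = 0;		/* normal-path release */
}

/* Returns true on success, false if a simulated error was caught. */
bool
process_indexes(bool is_autovacuum, bool fail)
{
	jmp_buf		local;
	jmp_buf    *saved = error_handler;

	if (!is_autovacuum)
	{
		/* Manual VACUUM reserves nothing, so no handler is armed. */
		process_indexes_internal(false, fail);
		return true;
	}

	handlers_armed++;
	if (setjmp(local) != 0)
	{
		/* Error path: restore the outer handler and release everything. */
		error_handler = saved;
		reserved_workers = 0;
		return false;			/* real code would re-throw here */
	}
	error_handler = &local;
	process_indexes_internal(true, fail);
	error_handler = saved;
	return true;
}
```

Calling process_indexes(false, false) leaves handlers_armed at 0, while process_indexes(true, true) arms one handler, catches the simulated error, and leaves reserved_workers at 0. The second variant, a single handler at the top of the worker's main loop, would avoid even the per-call setjmp() in the autovacuum case; it is not modelled here.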
In general, I think this patchset badly lacks testing. I think it needs
TAP tests that check from the logs that autovacuum has been done in
parallel. Also, it would be good to set up some injection points and
check that reserved autovacuum parallel workers are released
correctly in the case of errors.
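The property such an error-injection test would check can be stated as an invariant: once a leader has finished, with or without an error, the number of free parallel workers is back at the configured maximum. Below is a minimal standalone model of that invariant. WorkerPool, Leader, and all function names are hypothetical and only loosely mirror the patch's counters; the sketch assumes that a reservation subtracts exactly the number of workers actually granted.

```c
#include <assert.h>

/* Hypothetical stand-ins for the shared and leader-local counters. */
typedef struct WorkerPool
{
	int			max_workers;	/* configured ceiling */
	int			free_workers;	/* currently unreserved */
} WorkerPool;

typedef struct Leader
{
	int			held;			/* workers reserved by this leader */
} Leader;

/* Reserve up to 'wanted' workers; returns how many were granted. */
int
leader_reserve(WorkerPool *pool, Leader *leader, int wanted)
{
	int			granted;

	assert(wanted >= 0 && leader->held == 0);
	granted = wanted < pool->free_workers ? wanted : pool->free_workers;
	pool->free_workers -= granted;
	leader->held += granted;
	return granted;
}

/* Give back 'n' workers, never exceeding the configured ceiling. */
void
leader_release(WorkerPool *pool, Leader *leader, int n)
{
	assert(n >= 0 && n <= leader->held);
	pool->free_workers += n;
	if (pool->free_workers > pool->max_workers)
		pool->free_workers = pool->max_workers;
	leader->held -= n;
}

/* Error-path cleanup: give back whatever is still held. */
void
leader_release_all(WorkerPool *pool, Leader *leader)
{
	leader_release(pool, leader, leader->held);
}

/*
 * Simulate one leader run: reserve, launch at most 'can_launch' workers,
 * then either finish normally or hit an injected error. Returns the number
 * of free workers afterwards.
 */
int
run_leader(WorkerPool *pool, int wanted, int can_launch, int inject_error)
{
	Leader		leader = {0};
	int			granted = leader_reserve(pool, &leader, wanted);
	int			launched = granted < can_launch ? granted : can_launch;

	/* Return the workers that were reserved but could not be launched. */
	leader_release(pool, &leader, granted - launched);

	if (inject_error)
	{
		leader_release_all(pool, &leader);
		return pool->free_workers;
	}

	leader_release(pool, &leader, launched);
	return pool->free_workers;
}
```

For a pool with max_workers = 4, run_leader(&pool, 6, 3, 1) and run_leader(&pool, 6, 3, 0) both return 4. A real test would assert the same thing against the shared-memory counter after an injected error.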
------
Regards,
Alexander Korotkov
Supabase
Attachments:
[application/octet-stream] v11-0001-Parallel-index-autovacuum.patch (18.9K, 3-v11-0001-Parallel-index-autovacuum.patch)
From c40bfce2f812370315ca9ea735b9d3d31384d4d2 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <[email protected]>
Date: Mon, 15 Sep 2025 21:12:01 +0300
Subject: [PATCH v11 1/3] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 68 ++++++++-
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_parameters.dat | 9 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
10 files changed, 240 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0af3fea68fa..1c98d43c6eb 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1881,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..4221e6084f5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -225,6 +228,8 @@ static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum);
+static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -373,8 +378,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +559,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +608,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -610,6 +621,30 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum)
+{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Use try/catch block
+ * to make ensure that all workers are released.
+ */
+ PG_TRY();
+ {
+ parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
+ false);
+ }
+ PG_CATCH();
+ {
+ /* Release all reserved parallel workers, if any. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseAllParallelWorkers();
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+}
+
+static void
+parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum)
{
int nworkers;
PVIndVacStatus new_status;
@@ -646,6 +681,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note, that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +732,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +790,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index dce4c8c45b9..2bcd2ceb2a9 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -150,6 +150,12 @@ int Log_autovacuum_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -284,6 +290,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -298,6 +306,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -363,6 +373,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -762,6 +773,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -778,6 +791,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2870,8 +2892,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3352,6 +3378,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that actually can be launched and reserves these workers
+ * (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nworkers;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers, that were reserved by
+ * current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3412,6 +3517,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3493,3 +3602,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 6bc6be13d2a..1926218558a 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2112,6 +2112,15 @@
max => 'MAX_BACKENDS',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'max_parallel_maintenance_workers', type => 'int', context => 'PGC_USERSET', group => 'RESOURCES_WORKER_PROCESSES',
short_desc => 'Sets the maximum number of parallel processes per maintenance operation.',
variable => 'max_parallel_maintenance_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c36fcb9ab61..d277fef1735 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -684,6 +684,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 6b20a4404b2..0fb04e08c5d 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1402,6 +1402,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..904c5ce37d8 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..edd286808bf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will computed based on number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.39.5 (Apple Git-154)
[application/octet-stream] v11-0003-Documentation-for-parallel-autovacuum.patch (4.4K, 4-v11-0003-Documentation-for-parallel-autovacuum.patch)
From 45c18534682dc4dff219518e2112b4861e3f6baf Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 22 Jul 2025 12:31:20 +0700
Subject: [PATCH v11 3/3] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e9b420f3ddb..ffab6c6bea9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9196,6 +9197,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. Is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index e7a9f58c015..4e450ba9066 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -896,6 +896,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the enabled
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index dc000e913c1..288de6b0ffd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If value is 0 then parallel degree will computed based on
+ number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.39.5 (Apple Git-154)
[application/octet-stream] v11-0002-Logging-for-parallel-autovacuum.patch (8.6K, 5-v11-0002-Logging-for-parallel-autovacuum.patch)
From af2040cb5408f3876748f95dd8ee055358314caa Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 18 Aug 2025 15:14:25 +0700
Subject: [PATCH v11 2/3] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 ++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 28 ++++++++++++++++++---------
src/include/commands/vacuum.h | 16 +++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 59 insertions(+), 13 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 981d9380a92..6fe84d8747a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -687,6 +693,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by autovacuum leader worker, because it must log it in
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1011,6 +1027,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2639,7 +2660,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3052,7 +3074,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 4221e6084f5..cada1722b76 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,9 +227,10 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum);
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -504,7 +505,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -515,7 +516,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -523,7 +524,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -535,7 +537,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -620,7 +622,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
/*
* Parallel autovacuum can reserve parallel workers. Use try/catch block
@@ -629,7 +631,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
PG_TRY();
{
parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
- false);
+ vacuum, wusage);
}
PG_CATCH();
{
@@ -644,7 +646,8 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
static void
parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum)
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -768,6 +771,13 @@ parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..0829a9658f2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..6f9c418689c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2366,6 +2366,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.39.5 (Apple Git-154)
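For illustration only (this is not code from the patch): a minimal, self-contained sketch of how the PVWorkersUsage counters would accumulate across several index passes. The struct mirrors the one added to vacuum.h above; record_index_pass() is a hypothetical stand-in for the bookkeeping done at the end of parallel_vacuum_process_all_indexes_internal().

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct added to vacuum.h in the patch above. */
typedef struct PVWorkersUsage
{
	int			nlaunched;
	int			nplanned;
} PVWorkersUsage;

/*
 * Hypothetical helper: record one index vacuum/cleanup pass.  A NULL
 * pointer means the caller is not interested in the statistics.
 */
static void
record_index_pass(PVWorkersUsage *wusage, int nplanned, int nlaunched)
{
	/* We can never launch more workers than we planned. */
	assert(nlaunched >= 0 && nlaunched <= nplanned);

	if (wusage != NULL)
	{
		wusage->nlaunched += nlaunched;
		wusage->nplanned += nplanned;
	}
}
```

Because the counters are only ever added to, the caller has to zero the struct once before the first pass; the totals then cover all index scans of one table.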
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-09-16 18:30 Masahiko Sawada <[email protected]>
parent: Alexander Korotkov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-09-16 18:30 UTC (permalink / raw)
To: Alexander Korotkov <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Sep 15, 2025 at 11:50 AM Alexander Korotkov
<[email protected]> wrote:
>
> Hi!
>
> On Tue, Aug 19, 2025 at 12:04 AM Masahiko Sawada <[email protected]> wrote:
> >
> > On Mon, Aug 18, 2025 at 1:31 AM Daniil Davydov <[email protected]> wrote:
> > >
> > >
> > > On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <[email protected]> wrote:
> > > >
> > >
> > > > 2. when an autovacuum worker (not parallel vacuum worker) who uses
> > > > parallel vacuum gets SIGHUP, it errors out with the error message
> > > > "parameter "max_stack_depth" cannot be set during a parallel
> > > > operation". Autovacuum checks the configuration file reload in
> > > > vacuum_delay_point(), and while reloading the configuration file, it
> > > > attempts to set max_stack_depth in
> > > > InitializeGUCOptionsFromEnvironment() (which is called by
> > > > ProcessConfigFileInternal()). However, it cannot change
> > > > max_stack_depth since the worker is in parallel mode but
> > > > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
> > > > happen in regular backends who are using parallel queries because they
> > > > check the configuration file reload at the end of each SQL command.
> > > >
> > >
> > > Hm, this is a really serious problem. I see only two ways to solve it (both are
> > > not really good) :
> > > 1)
> > > Do not allow processing of the config file during parallel autovacuum
> > > execution.
> > >
> > > 2)
> > > Teach the autovacuum to enter parallel mode only during the index vacuum/cleanup
> > > phase. I'm a bit wary about it, because the design says that we should
> > > be in parallel
> > > mode during the whole parallel operation. But actually, if we can make
> > > sure that all
> > > launched workers are exited, I don't see reasons, why can't we just
> > > exit parallel mode
> > > at the end of parallel_vacuum_process_all_indexes.
> > >
> > > What do you think about it?
> >
> > Hmm, given that we're trying to support parallel heap vacuum on
> > another thread[1] and we will probably support it in autovacuums, it
> > seems to me that these approaches won't work.
> >
> > Another idea would be to allow autovacuum workers to process the
> > config file even in parallel mode. GUC changes in the leader worker
> > would not affect parallel vacuum workers, but it is fine to me. In the
> > context of autovacuum, only specific GUC parameters related to
> > cost-based delays need to be affected also to parallel vacuum workers.
> > Probably we need some changes to compute_parallel_delay() so that
> > parallel workers can compute the sleep time based on the new
> > vacuum_cost_limit and vacuum_cost_delay after the leader process
> > (i.e., autovacuum worker) reloads the config file.
> >
> > >
> > > Again, thank you for the review. Please, see v10 patches (only 0001
> > > has been changed) :
> > > 1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
> > > 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so we can
> > > release workers even after an error. This required adding a static
> > > variable to account
> > > for the total number of reserved workers (av_nworkers_reserved).
> > > 3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
> > > autovacuum code. Assign hook has been removed.
> > > 4) Use shmem value for determining the maximum number of parallel autovacuum
> > > workers (eliminate race condition between launcher and leader process).
> >
> > Thank you for updating the patch! I'll review the new version patches.
>
> I've rebased this patchset to the current master. That required me to move the new GUC definition to guc_parameters.dat. Also, I adjusted typedefs.list and made pgindent. Some notes about the patch.
Thank you for updating the patch!
> I see parallel_vacuum_process_all_indexes() have a TRY/CATCH block. As I heard, the overhead of setting/doing jumps is platform-dependent, and not harmless on some platforms. Therefore, can we skip TRY/CATCH block for non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain(), that would save us from repeatedly setting a jump in autovacuum workers too.
I wonder if using the TRY/CATCH block is not enough to ensure that
autovacuum workers release the reserved parallel workers in FATAL
cases.
> In general, I think this patchset badly lack of testing. I think it needs tap tests checking from the logs that autovacuum has been done in parallel. Also, it would be good to set up some injection points, and check that reserved autovacuum parallel workers are getting released correctly in the case of errors.
+1
IIUC the patch still has one problem in terms of reloading the
configuration parameters during parallel mode as I mentioned
before[1].
Regards,
[1] https://www.postgresql.org/message-id/CAD21AoBRRXbNJEvCjS-0XZgCEeRBzQPKmrSDjJ3wZ8TN28vaCQ%40mail.gma...
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-10-28 13:09 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-10-28 13:09 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Sep 16, 2025 at 1:50 AM Alexander Korotkov
<[email protected]> wrote:
>
> I've rebased this patchset to the current master.
> That required me to move the new GUC definition to guc_parameters.dat.
> Also, I adjusted typedefs.list and made pgindent.
Thank you for looking into it!
>
> + {
> + {
> + "autovacuum_parallel_workers",
> + "Maximum number of parallel autovacuum workers that can be used for processing this table.",
> + RELOPT_KIND_HEAP,
> + ShareUpdateExclusiveLock
> + },
> + -1, -1, 1024
> + },
>
> Should we use MAX_PARALLEL_WORKER_LIMIT instead of hard-coded 1024 here?
I'm afraid that we will have to include an additional header file to do this.
As far as I know, we are trying not to do so. For now, I will leave it
hardcoded.
>
> - * Support routines for parallel vacuum execution.
> + * Support routines for parallel vacuum and autovacuum execution. In the
> + * future comments, the word "vacuum" will refer to both vacuum and
> + * autovacuum.
>
> Not sure about the usage of word "future" here.
> It doesn't look clear what it means.
> Could we use "below" or "within this file"?
Agree, fixed.
> I see parallel_vacuum_process_all_indexes() have a TRY/CATCH block.
> As I heard, the overhead of setting/doing jumps is platform-dependent, and
> not harmless on some platforms. Therefore, can we skip TRY/CATCH block
> for non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain(),
> that would save us from repeatedly setting a jump in autovacuum workers too.
Good idea. I found a try/catch block inside the "do_autovacuum" function, which
is called only from autovacuum. I decided to move the ReleaseAllWorkers
call there.
>
> In general, I think this patchset badly lack of testing. I think it needs tap tests
> checking from the logs that autovacuum has been done in parallel. Also, it
> would be good to set up some injection points, and check that reserved
> autovacuum parallel workers are getting released correctly in the case of errors.
Some time ago I tried to write a test, but it looked very ugly. Your
idea with injection points
helped me to write much more reliable tests - see them in the new (v12)
set of patches.
On Wed, Sep 17, 2025 at 1:31 AM Masahiko Sawada <[email protected]> wrote:
>
> On Mon, Sep 15, 2025 at 11:50 AM Alexander Korotkov
> <[email protected]> wrote:
> >
> > I see parallel_vacuum_process_all_indexes() have a TRY/CATCH block. As I heard,
> > the overhead of setting/doing jumps is platform-dependent, and not harmless on some
> > platforms. Therefore, can we skip TRY/CATCH block for non-autovacuum vacuum?
> > Possibly we could move it to AutoVacWorkerMain(), that would save us from repeatedly
> > setting a jump in autovacuum workers too.
>
> I wonder if using the TRY/CATCH block is not enough to ensure that
> autovacuum workers release the reserved parallel workers in FATAL
> cases.
>
That's true. I'll register a "before_shmem_exit" callback for autovacuum,
which will release any reserved workers if the a/v worker exits
abnormally.
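To make the idea concrete, here is a rough standalone sketch (not the patch code; all names are hypothetical and the locking around the shared counter is omitted). The point is that "release everything I have reserved" is idempotent, so it can be called both from the ERROR path and from an exit callback without releasing twice:

```c
#include <assert.h>

/* Stands in for the shared counter of free parallel a/v workers. */
static int	pool_free = 10;

/* Stands in for the per-process count of workers this leader reserved. */
static int	nreserved_local = 0;

/* Reserve up to nrequested workers; returns how many were reserved. */
static int
reserve_workers(int nrequested)
{
	int			n;

	assert(nrequested >= 0);

	n = (nrequested < pool_free) ? nrequested : pool_free;
	pool_free -= n;
	nreserved_local += n;
	return n;
}

/*
 * Give back everything this process still holds.  Safe to call more than
 * once: the second call finds nothing left to release.
 */
static void
release_all_workers(void)
{
	pool_free += nreserved_local;
	nreserved_local = 0;
}
```

In the real code the release function would be invoked from the catch block in do_autovacuum and from the exit callback; the sketch only shows why calling it from both places is harmless.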
>
> IIUC the patch still has one problem in terms of reloading the
> configuration parameters during parallel mode as I mentioned
> before[1].
>
Yep. I was happy to see that you think that config file processing is OK for
autovacuum :)
I'll allow it for a/v leader. I've also thought about "compute_parallel_delay".
The simplest solution that I see is to move cost-based delay parameters to
shared state (PVShared) and create some variables such as
VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
What do you think about this idea?
Other approaches, like "tell parallel workers that they should
reload config",
look a bit too invasive IMO.
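Roughly, the shared-state idea could look like the following standalone sketch. It is only an illustration under my own assumptions: the struct and function names are hypothetical, plain C11 atomics stand in for PostgreSQL's shared memory primitives, and the sleep formula is just the proportional "delay * balance / limit" shape, not the exact logic of vacuum_delay_point.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical cost-delay parameters kept in shared state. */
typedef struct SharedCostParams
{
	atomic_int	cost_limit;		/* like vacuum_cost_limit */
	atomic_int	cost_delay_us;	/* like vacuum_cost_delay, in microseconds */
} SharedCostParams;

/* Leader: publish new values after it has reloaded the config file. */
static void
leader_publish(SharedCostParams *p, int limit, int delay_us)
{
	assert(limit > 0 && delay_us >= 0);
	atomic_store(&p->cost_limit, limit);
	atomic_store(&p->cost_delay_us, delay_us);
}

/* Worker: compute the sleep time from whatever is currently published. */
static int
worker_sleep_us(SharedCostParams *p, int balance)
{
	int			limit = atomic_load(&p->cost_limit);
	int			delay = atomic_load(&p->cost_delay_us);

	/* No delay configured, or not enough work done yet. */
	if (delay <= 0 || balance < limit)
		return 0;

	return delay * balance / limit;
}
```

Note that the two values are stored separately, so a worker may briefly see a new limit together with an old delay. For throttling purposes that seems tolerable, but it is a design choice to decide on.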
Thanks everybody for the review! Please, see v12 patches :
1) Implement tests for parallel autovacuum
2) Fix error with unreleased workers - see try/catch block in do_autovacuum
and before_shmem_exit callback registration in AutoVacWorkerMain
3) Allow a/v leader to process config file (see guc.c)
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v12-0004-Documentation-for-parallel-autovacuum.patch (4.4K, 2-v12-0004-Documentation-for-parallel-autovacuum.patch)
From 2afcd438a54849d3fe4e4f4afc230fb2c69c09db Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 28 Oct 2025 15:20:12 +0700
Subject: [PATCH v12 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0a2a8b49fdb..d3ea02cbbe0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9254,6 +9255,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index dc59c88319e..2db34cec0a9 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index a157a244e4e..6eb58c95d9e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
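As a reading aid for the documentation above (not code from the patch): a small sketch of how the three cases of autovacuum_parallel_workers could map to a worker count. The function name is hypothetical, and subtracting one index for the leader is an assumption based on how parallel_vacuum_compute_workers behaves for regular VACUUM as far as I understand it.

```c
#include <assert.h>

/*
 * Hypothetical helper.
 *   reloption             value of autovacuum_parallel_workers (-1, 0 or N)
 *   nindexes_parallel     number of indexes eligible for parallel processing
 *   av_max_parallel       value of autovacuum_max_parallel_workers
 * Returns the number of parallel workers to plan for.
 */
static int
av_parallel_degree(int reloption, int nindexes_parallel, int av_max_parallel)
{
	int			nworkers;

	assert(reloption >= -1);

	/* -1 (the default) disables parallel index vacuuming for the table. */
	if (reloption < 0 || nindexes_parallel <= 1)
		return 0;

	/* Assumption: the leader processes one index itself. */
	nworkers = nindexes_parallel - 1;

	/* A positive value is an upper bound; 0 means "derive from indexes". */
	if (reloption > 0 && reloption < nworkers)
		nworkers = reloption;

	/* Never exceed the cluster-wide limit. */
	if (nworkers > av_max_parallel)
		nworkers = av_max_parallel;

	return nworkers;
}
```

With the settings used in the TAP test (five indexes including the primary key, autovacuum_parallel_workers = 2, autovacuum_max_parallel_workers = 10) this yields 2 planned workers, which matches what the test waits for in the log.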
[text/x-patch] v12-0003-Tests-for-parallel-autovacuum.patch (18.4K, 3-v12-0003-Tests-for-parallel-autovacuum.patch)
download | inline diff:
From f923d8f302b5a2d307718a665129e8bef089211c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 28 Oct 2025 15:19:17 +0700
Subject: [PATCH v12 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
9 files changed, 543 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To make it possible to test that all reserved parallel workers are
+ * released even after a failure, allow injection points to trigger a
+ * failure at this point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9499d4f0c12..a6358200629 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3437,6 +3437,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3467,6 +3474,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..22eaaa7da9d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check that all requested workers were:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/workers usage statistics for all of index scans : / .
+ qr/launched in total = 2, planned in total = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..2c979c405bd
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFaulureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFaulureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFaulureType ftype;
+} InjPointState;
+
+static InjPointState *inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type: %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
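For readers who want to follow the failure-injection control flow without a server build, here is a minimal standalone sketch of the logic in trigger_leader_failure() and inj_trigger_leader_failure(). It is only an illustration: the helper names parse_failure_type and should_raise and the enum name LeaderFailureType are hypothetical and do not appear in the patch, and the PostgreSQL calls (text_to_cstring, ereport) are replaced by plain C.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the three failure modes of the test module (name is hypothetical). */
typedef enum LeaderFailureType
{
    FAIL_NONE,
    FAIL_ERROR,
    FAIL_FATAL
} LeaderFailureType;

/*
 * Hypothetical helper: map the SQL-level argument to the enum. Returns
 * false and leaves *result untouched for an unknown name, which is the
 * case where the patch raises ERRCODE_INVALID_PARAMETER_VALUE.
 */
static bool
parse_failure_type(const char *name, LeaderFailureType *result)
{
    assert(name != NULL && result != NULL);

    if (strcmp(name, "NONE") == 0)
        *result = FAIL_NONE;
    else if (strcmp(name, "ERROR") == 0)
        *result = FAIL_ERROR;
    else if (strcmp(name, "FATAL") == 0)
        *result = FAIL_FATAL;
    else
        return false;

    return true;
}

/*
 * Hypothetical helper: the injection-point callback raises only when the
 * point is enabled and a failure type other than FAIL_NONE was requested.
 */
static bool
should_raise(bool enabled, LeaderFailureType ftype)
{
    return enabled && ftype != FAIL_NONE;
}
```

Note that the comparison is case-sensitive, as in the patch, so "error" is rejected while "ERROR" is accepted.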
[text/x-patch] v12-0002-Logging-for-parallel-autovacuum.patch (7.7K, 4-v12-0002-Logging-for-parallel-autovacuum.patch)
From 57ea4c318664f6e0b72040d14e7a7d9f82d2036c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 18 Aug 2025 15:14:25 +0700
Subject: [PATCH v12 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d2b031fdd06..d364cde5fe5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for the workers usage statistics. A non-NULL pointer
+ * tells the parallel vacuum code that these statistics must be
+ * accumulated. For now, only the autovacuum leader uses this, because
+ * it logs the totals at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1024,6 +1040,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all index scans: launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2653,7 +2674,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3085,7 +3107,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total number of parallel workers launched and
+ * planned during a parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 43fe3bcd593..830763eb2fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2372,6 +2372,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
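The accumulation added by this logging patch can be tried outside the server. The following is a minimal sketch, assuming the same two-field struct as in vacuum.h; the helpers workers_usage_add and workers_usage_format are hypothetical names used only for illustration, and the output string is not the message the patch emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the PVWorkersUsage struct added to vacuum.h. */
typedef struct PVWorkersUsage
{
    int nlaunched;
    int nplanned;
} PVWorkersUsage;

/*
 * Hypothetical helper: add the result of one index-processing pass to the
 * running totals. A NULL pointer means the caller did not ask for
 * statistics, so nothing is recorded.
 */
static void
workers_usage_add(PVWorkersUsage *wusage, int nlaunched, int nplanned)
{
    if (wusage == NULL)
        return;

    wusage->nlaunched += nlaunched;
    wusage->nplanned += nplanned;
}

/*
 * Hypothetical helper: format a one-line summary into buf and return the
 * value reported by snprintf().
 */
static int
workers_usage_format(const PVWorkersUsage *wusage, char *buf, size_t buflen)
{
    assert(wusage != NULL);

    return snprintf(buf, buflen,
                    "parallel workers: %d launched in total, %d planned in total",
                    wusage->nlaunched, wusage->nplanned);
}
```

Because the totals are summed over every bulk-delete and cleanup pass, a table that needs several index scans can report more planned workers than the configured limit; the numbers are cumulative, not a peak.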
[text/x-patch] v12-0001-Parallel-index-autovacuum.patch (19.6K, 5-v12-0001-Parallel-index-autovacuum.patch)
From 2217fc7b293c267ab497c84251dae31c0bfda7e9 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 28 Oct 2025 17:47:13 +0700
Subject: [PATCH v12] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 163 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 5084af7dfb6..9499d4f0c12 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, adjust the number of
+ * available parallel autovacuum workers in shmem to match.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1405,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * If parallel autovacuum leader is finishing due to FATAL error, make sure
+ * that all reserved workers are released.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1462,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2515,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that
+ * all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2877,8 +2918,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Take the parallel degree from the table's reloptions; -1 disables it. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3360,6 +3405,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that actually can be launched and reserves these workers
+ * (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3420,6 +3544,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3501,3 +3629,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a82286cc98a..e7c5982da2a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3387,9 +3387,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only cost-based
+ * delay settings need to be propagated to parallel vacuum workers, and
+ * that is handled elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index d6fc8333850..5fbda66b3d4 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2129,6 +2129,15 @@
max => 'MAX_BACKENDS',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'max_parallel_maintenance_workers', type => 'int', context => 'PGC_USERSET', group => 'RESOURCES_WORKER_PROCESSES',
short_desc => 'Sets the maximum number of parallel processes per maintenance operation.',
variable => 'max_parallel_maintenance_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f62b61967ef..b3e471ed33e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 36ea6a4d557..d89da606920 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1412,6 +1412,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..f4b93b44531 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree is computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
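The reserve/release accounting in this patch is easy to get wrong, so a small standalone model may help when reviewing it. The sketch below is not the patch's code: WorkerPool, pool_reserve, pool_release and pool_set_max are hypothetical names, the AutovacuumLock is omitted because the model is single-threaded, and it assumes the pool is decremented only by the number of workers actually granted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two counters kept in AutoVacuumShmemStruct. */
typedef struct WorkerPool
{
    uint32_t free_workers;  /* models av_freeParallelWorkers */
    uint32_t max_workers;   /* models av_maxParallelWorkers */
} WorkerPool;

/* Models the leader-local av_nworkers_reserved counter. */
static int nworkers_reserved = 0;

static uint32_t
min_u32(uint32_t a, uint32_t b)
{
    return (a < b) ? a : b;
}

/* Grant at most 'nworkers'; the pool shrinks only by what was granted. */
static int
pool_reserve(WorkerPool *pool, int nworkers)
{
    uint32_t granted;

    assert(nworkers >= 0);
    assert(nworkers_reserved == 0);

    granted = min_u32(pool->free_workers, (uint32_t) nworkers);
    pool->free_workers -= granted;
    nworkers_reserved += (int) granted;

    return (int) granted;
}

/* Return workers to the pool, never exceeding the current maximum. */
static void
pool_release(WorkerPool *pool, int nworkers)
{
    assert(nworkers >= 0 && nworkers <= nworkers_reserved);

    pool->free_workers = min_u32(pool->free_workers + (uint32_t) nworkers,
                                 pool->max_workers);
    nworkers_reserved -= nworkers;
}

/* Apply a new maximum, following the steps of adjust_free_parallel_workers(). */
static void
pool_set_max(WorkerPool *pool, uint32_t new_max)
{
    uint32_t prev_max = pool->max_workers;

    /* Cap the free count first, then add the headroom gained, if any. */
    pool->free_workers = min_u32(pool->free_workers, new_max);
    if (new_max > prev_max)
        pool->free_workers += new_max - prev_max;

    pool->max_workers = new_max;
}
```

The case worth checking by hand is a limit that is lowered while workers are still reserved: the cap in pool_release keeps the free count from exceeding the new maximum once those workers come back.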
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-10-31 07:54 Daniil Davydov <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2025-10-31 07:54 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Oct 28, 2025 at 8:09 PM Daniil Davydov <[email protected]> wrote:
>
> Thanks everybody for the review! Please, see v12 patches :
> 1) Implement tests for parallel autovacuum
I forgot to add the new directory to the Makefile and meson.build files.
Fixed in v13 patches (only 0003 has changed).
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v13-0002-Logging-for-parallel-autovacuum.patch (7.7K, 2-v13-0002-Logging-for-parallel-autovacuum.patch)
From d1544aaad4206687afe730a43e818f16a4f67710 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:42:46 +0700
Subject: [PATCH v13 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d2b031fdd06..d364cde5fe5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Total number of planned and actually launched parallel workers across
+ * all index scans, or NULL if these statistics are not collected.
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for worker usage statistics. A non-NULL pointer
+ * signals that these statistics must be accumulated. For now, only
+ * the autovacuum leader worker uses this, because it must log the
+ * totals at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1024,6 +1040,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2653,7 +2674,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3085,7 +3107,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total number of launched and planned workers
+ * during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 018b5919cf6..2c73faa30e7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2372,6 +2372,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
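
Editorial sketch (not part of the patch): the accounting that 0002 adds can be read as a small accumulator. The struct below mirrors PVWorkersUsage from the patch; record_workers_usage() is a hypothetical stand-in for the step at the end of parallel_vacuum_process_all_indexes(), written as standalone C so that it compiles outside the PostgreSQL tree. It assumes that the number of launched workers never exceeds the number planned for a pass.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the PVWorkersUsage struct introduced by the patch. */
typedef struct PVWorkersUsage
{
	int			nlaunched;
	int			nplanned;
} PVWorkersUsage;

/*
 * Hypothetical helper: add the counts of one parallel index pass to the
 * running totals.  A NULL pointer means the caller did not ask for
 * statistics, so nothing is recorded.
 */
static void
record_workers_usage(PVWorkersUsage *wusage, int nplanned, int nlaunched)
{
	/* Assumption: a pass never launches more workers than it planned. */
	assert(nlaunched >= 0 && nlaunched <= nplanned);

	if (wusage != NULL)
	{
		wusage->nlaunched += nlaunched;
		wusage->nplanned += nplanned;
	}
}
```

Because the totals are summed over every bulk-delete and cleanup pass, a table that needs two index scans with four planned workers each reports eight planned workers in the final log line, not four.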
[text/x-patch] v13-0004-Documentation-for-parallel-autovacuum.patch (4.4K, 3-v13-0004-Documentation-for-parallel-autovacuum.patch)
From 01ecdb2e6ebc57ddd7f343d135617a92ba0ebf73 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:44:35 +0700
Subject: [PATCH v13 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0a2a8b49fdb..d3ea02cbbe0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9254,6 +9255,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is
+ capped by <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which disables parallel index vacuuming by autovacuum.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index dc59c88319e..2db34cec0a9 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process processes a table whose
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ is set to 0 or more, it can launch parallel workers to vacuum the table's
+ indexes in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index a157a244e4e..6eb58c95d9e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which disables parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
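
Editorial sketch (not part of the patch): the documented settings combine roughly as follows. The function name sketch_parallel_degree() and its parameters are hypothetical; the logic follows the hunk in parallel_vacuum_compute_workers() from 0001, taking the number of indexes eligible for parallel processing as an already computed input. A reloption of -1 disables parallelism, 0 derives the degree from the eligible indexes, and a positive value is an upper bound. The result is then capped by the GUC.

```c
#include <assert.h>

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Hypothetical helper.
 *   reloption          value of autovacuum_parallel_workers (-1, 0, or N)
 *   nindexes_parallel  indexes eligible for parallel processing (assumed given)
 *   guc_max            value of autovacuum_max_parallel_workers
 */
static int
sketch_parallel_degree(int reloption, int nindexes_parallel, int guc_max)
{
	int			workers;

	assert(reloption >= -1 && nindexes_parallel >= 0 && guc_max >= 0);

	/* -1 on the table or 0 in the GUC means no parallel index vacuuming. */
	if (reloption < 0 || guc_max == 0 || nindexes_parallel == 0)
		return 0;

	/* 0 means "derive from the indexes"; N > 0 is an upper bound. */
	workers = (reloption > 0) ?
		min_int(reloption, nindexes_parallel) : nindexes_parallel;

	/* Cap by the GUC. */
	return min_int(workers, guc_max);
}
```

This is only the planned degree. As the documentation text notes, fewer workers may be available at run time.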
[text/x-patch] v13-0001-Parallel-autovacuum.patch (19.6K, 4-v13-0001-Parallel-autovacuum.patch)
From 72bfe3c48b7c445038cc8c83e3b9fd5ad72e27d2 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:42:27 +0700
Subject: [PATCH v13 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 163 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 5084af7dfb6..9499d4f0c12 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1405,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * If parallel autovacuum leader is finishing due to FATAL error, make sure
+ * that all reserved workers are released.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1462,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2515,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that
+ * all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2877,8 +2918,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3360,6 +3405,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that actually can be launched and reserves these workers
+ * (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3420,6 +3544,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3501,3 +3629,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 679846da42c..d1d796a1b18 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3315,9 +3315,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only
+ * cost-based delay settings need to be propagated to parallel vacuum
+ * workers, and we will handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index d6fc8333850..5fbda66b3d4 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2129,6 +2129,15 @@
max => 'MAX_BACKENDS',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the background worker pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'max_parallel_maintenance_workers', type => 'int', context => 'PGC_USERSET', group => 'RESOURCES_WORKER_PROCESSES',
short_desc => 'Sets the maximum number of parallel processes per maintenance operation.',
variable => 'max_parallel_maintenance_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f62b61967ef..b3e471ed33e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 36ea6a4d557..d89da606920 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1412,6 +1412,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..f4b93b44531 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree is computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
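
Editorial sketch (not part of the patch): the reserve/release bookkeeping in 0001 can be modelled without shared memory or locks. SketchPool stands in for the av_freeParallelWorkers and av_maxParallelWorkers fields, and SketchLeader for the leader-local av_nworkers_reserved counter; all names here are hypothetical. The sketch subtracts only the number of workers actually granted, so the unsigned free counter cannot wrap below zero, and it caps the free count on release in case the maximum was lowered in the meantime. The real code performs these steps while holding AutovacuumLock.

```c
#include <assert.h>

/* Stand-in for the shared pool state (no locking in this sketch). */
typedef struct SketchPool
{
	unsigned int free_workers;
	unsigned int max_workers;
} SketchPool;

/* Stand-in for the leader-local count of reserved workers. */
typedef struct SketchLeader
{
	int			nreserved;
} SketchLeader;

/* Reserve up to nworkers; returns how many were actually granted. */
static int
sketch_reserve(SketchPool *pool, SketchLeader *leader, int nworkers)
{
	int			granted;

	assert(nworkers > 0);
	assert(leader->nreserved == 0);

	granted = (pool->free_workers < (unsigned int) nworkers) ?
		(int) pool->free_workers : nworkers;

	pool->free_workers -= (unsigned int) granted;
	leader->nreserved += granted;
	return granted;
}

/* Give nworkers back, never exceeding the current maximum. */
static void
sketch_release(SketchPool *pool, SketchLeader *leader, int nworkers)
{
	unsigned int nfree;

	assert(nworkers >= 0 && nworkers <= leader->nreserved);

	nfree = pool->free_workers + (unsigned int) nworkers;
	pool->free_workers = (nfree < pool->max_workers) ? nfree : pool->max_workers;
	leader->nreserved -= nworkers;
}
```

A leader that asks for five workers from a pool of three is granted three. If one of them fails to launch, it releases that one immediately and the remaining two after the index pass.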
[text/x-patch] v13-0003-Tests-for-parallel-autovacuum.patch (19.2K, 5-v13-0003-Tests-for-parallel-autovacuum.patch)
From 534ed530a7fd11cad34f6c39678b9937971b80b0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:44:12 +0700
Subject: [PATCH v13 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9499d4f0c12..a6358200629 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3437,6 +3437,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nworkers;
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3467,6 +3474,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..22eaaa7da9d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres, so it can launch parallel autovacuum workers, log all
+# information we are interested in and autovacuum works frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/workers usage statistics for all of index scans : / .
+ qr/launched in total = 2, planned in total = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..2c979c405bd
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFaulureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFaulureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFaulureType ftype;
+} InjPointState;
+
+static InjPointState *inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-10-31 20:03 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2025-10-31 20:03 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Tue, Oct 28, 2025 at 6:10 AM Daniil Davydov <[email protected]> wrote:
>
> >
> > IIUC the patch still has one problem in terms of reloading the
> > configuration parameters during parallel mode as I mentioned
> > before[1].
> >
>
> Yep. I was happy to see that you think that config file processing is OK for
> autovacuum :)
> I'll allow it for a/v leader. I've also thought about "compute_parallel_delay".
> The simplest solution that I see is to move cost-based delay parameters to
> shared state (PVShared) and create some variables such as
> VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
> What do you think about this idea?
I think that we need to somehow have parallel workers use the new
vacuum delay parameters (e.g., VacuumCostPageHit and
VacuumCostPageMiss) after the leader reloads the configuration file.
The leader shares the initial parameters with the parallel workers
(via DSM) before starting the workers but doesn't propagate the
updates during the parallel operations. And the worker doesn't reload
the configuration file.
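To make the propagation problem concrete, here is a minimal standalone sketch (not code from the patch) of one way a leader could publish updated cost parameters and a worker could pick them up at its next delay point. All names here (SharedCostParams, leader_publish_costs, worker_refresh_costs) are hypothetical, and C11 atomics stand in for whatever locking or DSM layout a real patch would use:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared area, standing in for fields the leader keeps in DSM. */
typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped after every publish */
	atomic_int	cost_page_hit;
	atomic_int	cost_page_miss;
} SharedCostParams;

/* Hypothetical per-worker copy of the parameters. */
typedef struct LocalCostParams
{
	unsigned int seen_generation;
	int			cost_page_hit;
	int			cost_page_miss;
} LocalCostParams;

static void
shared_cost_init(SharedCostParams *shared, int hit, int miss)
{
	atomic_init(&shared->generation, 0);
	atomic_init(&shared->cost_page_hit, hit);
	atomic_init(&shared->cost_page_miss, miss);
}

/* Leader side: called after it has reloaded the configuration file. */
static void
leader_publish_costs(SharedCostParams *shared, int hit, int miss)
{
	atomic_store(&shared->cost_page_hit, hit);
	atomic_store(&shared->cost_page_miss, miss);
	atomic_fetch_add(&shared->generation, 1);
}

/*
 * Worker side: called at each delay point. Returns true if the local copy
 * was refreshed because the leader published something new.
 */
static bool
worker_refresh_costs(SharedCostParams *shared, LocalCostParams *local)
{
	unsigned int gen = atomic_load(&shared->generation);

	if (gen == local->seen_generation)
		return false;

	local->cost_page_hit = atomic_load(&shared->cost_page_hit);
	local->cost_page_miss = atomic_load(&shared->cost_page_miss);
	local->seen_generation = gen;
	return true;
}
```

The generation counter keeps the common case cheap: a worker only re-reads the parameters when something has changed. If the leader publishes twice in quick succession a worker may briefly see a mix of old and new values, but it will refresh again on its next check; a real implementation would probably want a lock around the update.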
>
> Other approaches, like "tell parallel workers that they should
> reload config",
> look a bit too invasive IMO.
>
>
> Thanks everybody for the review! Please, see v12 patches :
> 1) Implement tests for parallel autovacuum
> 2) Fix error with unreleased workers - see try/catch block in do_autovacuum
> and before_shmem_exit callback registration in AutoVacWorkerMain
> 3) Allow a/v leader to process config file (see guc.c)
>
Here are some review comments for 0001 patch:
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
AutoVacuumReleaseAllParallelWorkers() calls
AutoVacuumReleaseParallelWorkers() only when av_nworkers_reserved > 0,
so I think we don't need the condition 'if (code != 0)' here.
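A standalone sketch of why the exit-code check is redundant, assuming the release-all function is already a no-op when nothing is reserved. The counters and function names below are simplified stand-ins for the patch's shared state, not the patch code itself:

```c
/* Stand-in for AutoVacuumShmem->av_freeParallelWorkers. */
static int	free_parallel_workers = 10;

/* Per-process count of reserved workers (same name as in the patch). */
static int	av_nworkers_reserved = 0;

static void
release_parallel_workers(int nworkers)
{
	free_parallel_workers += nworkers;
	av_nworkers_reserved -= nworkers;
}

/* Does nothing when this process holds no reservation. */
static void
release_all_parallel_workers(void)
{
	if (av_nworkers_reserved > 0)
		release_parallel_workers(av_nworkers_reserved);
}

/*
 * Exit callback without the 'code != 0' test: on a clean exit the
 * reservation count is already zero, so the call is harmless.
 */
static void
worker_before_shmem_exit(int code)
{
	(void) code;
	release_all_parallel_workers();
}
```

With this shape the callback behaves the same for normal and abnormal exits, and the guard lives in exactly one place.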
---
+extern void AutoVacuumReleaseAllParallelWorkers(void);
There is no caller of this function outside of autovacuum.c.
---
{ name => 'autovacuum_max_parallel_workers', type => 'int', context =>
'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Maximum number of parallel autovacuum workers, that
can be taken from bgworkers pool.',
long_desc => 'This parameter is capped by "max_worker_processes"
(not by "autovacuum_max_workers"!).',
variable => 'autovacuum_max_parallel_workers',
boot_val => '0',
min => '0',
max => 'MAX_BACKENDS',
},
Parallel vacuum in autovacuum can be used only when users set the
autovacuum_parallel_workers storage parameter. How about using the
default value 2 for autovacuum_max_parallel_workers GUC parameter?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-11-20 19:31 Sami Imseih <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Sami Imseih @ 2025-11-20 19:31 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
I started to review this patch set again, and it needed rebasing, so I
went ahead and did that.
I also have some comments:
#1
In AutoVacuumReserveParallelWorkers()
I think here we should assert:
```
Assert(nworkers <= AutoVacuumShmem->av_freeParallelWorkers);
```
prior to:
```
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
```
We are capping nworkers earlier in parallel_vacuum_compute_workers()
```
/* Cap by GUC variable */
parallel_workers = Min(parallel_workers, max_workers);
```
so the assert will safeguard against someone making a faulty change
in parallel_vacuum_compute_workers().
#2
In
parallel_vacuum_process_all_indexes()
```
+ /*
+ * Reserve workers in autovacuum global state. Note, that we
may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
```
nworkers has a double meaning. The return value of
AutoVacuumReserveParallelWorkers
is nreserved. I think this should be
```
nreserved = AutoVacuumReserveParallelWorkers(nworkers);
```
and nreserved becomes the authoritative value for the number of parallel
workers after that point.
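For illustration, a simplified standalone sketch of the naming I have in mind. The helper names and the free-worker counter are stand-ins, and how the patch actually clamps the request is an assumption here:

```c
#include <stdbool.h>

/* Stand-in for AutoVacuumShmem->av_freeParallelWorkers. */
static int	free_parallel_workers = 2;

/* Returns the number actually reserved, which may be fewer than requested. */
static int
reserve_parallel_workers(int nworkers)
{
	int			nreserved;

	nreserved = (nworkers < free_parallel_workers) ?
		nworkers : free_parallel_workers;
	free_parallel_workers -= nreserved;

	return nreserved;
}

/*
 * Caller side: nworkers stays "what we asked for", nreserved is "what we
 * got" and is the value used from here on.
 */
static int
plan_parallel_index_vacuum(int nworkers, bool am_autovacuum_worker)
{
	int			nreserved = nworkers;	/* manual VACUUM reserves nothing */

	if (am_autovacuum_worker && nworkers > 0)
		nreserved = reserve_parallel_workers(nworkers);

	return nreserved;
}
```

Keeping the two variables separate also makes it easy to log both the requested and the granted number later.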
#3
I noticed in the logging:
```
2025-11-20 18:44:09.252 UTC [36787] LOG: automatic vacuum of table
"test.public.t": index scans: 0
workers usage statistics for all of index scans : launched in
total = 3, planned in total = 3
pages: 0 removed, 503306 remain, 14442 scanned (2.87% of
total), 0 eagerly scanned
tuples: 101622 removed, 7557074 remain, 0 are dead but not yet removable
removable cutoff: 1711, which was 1 XIDs old when operation ended
frozen: 4793 pages from table (0.95% of total) had 98303 tuples frozen
visibility map: 4822 pages set all-visible, 4745 pages set
all-frozen (0 were all-visible)
index scan bypassed: 8884 pages from table (1.77% of total)
have 195512 dead item identifiers
```
that even though the index scan was bypassed, we still launched parallel
workers. I didn't dig deep into this,
but that looks wrong. What do you think?
#4
instead of:
"workers usage statistics for all of index scans : launched in total =
0, planned in total = 0"
how about:
"parallel index scan : workers planned = 0, workers launched = 0"
Also, log this after the "index scan needed:" line, so it looks like
this. What do you think?
```
index scan needed: 13211 pages from table (2.63% of total) had
289482 dead item identifiers removed
parallel index scan : workers planned = 0, workers launched = 0
index "t_pkey": pages: 25234 in total, 0 newly deleted, 0 currently
deleted, 0 reusable
index "t_c1_idx": pages: 10219 in total, 0 newly deleted, 0
currently deleted, 0 reusable
```
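A tiny standalone sketch of producing the proposed line. It uses plain snprintf rather than the StringInfo machinery the server code would normally use, and format_parallel_usage is a hypothetical helper name:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the proposed log line into buf. Returns the number of characters
 * that would have been written, as snprintf does.
 */
static int
format_parallel_usage(char *buf, size_t len, int planned, int launched)
{
	return snprintf(buf, len,
					"parallel index scan : workers planned = %d, workers launched = %d",
					planned, launched);
}
```

Putting "planned" before "launched" mirrors the order in which the two numbers are decided.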
--
Sami Imseih
Amazon Web Services (AWS)
Attachments:
[application/octet-stream] v14-0004-Documentation-for-parallel-autovacuum.patch (4.4K, 2-v14-0004-Documentation-for-parallel-autovacuum.patch)
From 1302f966053c30c89c9365b48bff793844053d28 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:44:35 +0700
Subject: [PATCH v14 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..0f7096c2b5f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2841,6 +2841,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9264,6 +9265,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index f4f0433ef6f..02f306bbb8a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter set,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 6557c5cffd8..e95a6488c5e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If value is 0 then parallel degree will be computed based on
+ number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[application/octet-stream] v14-0003-Tests-for-parallel-autovacuum.patch (19.2K, 3-v14-0003-Tests-for-parallel-autovacuum.patch)
From 3a7cacb4e7ded37dac5eb54e7614942e9684f690 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:44:12 +0700
Subject: [PATCH v14 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 2b6ceedf987..3a8a617fc63 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3435,6 +3435,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nworkers;
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3465,6 +3472,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point that lets tests observe the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..22eaaa7da9d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# everything we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check that all requested workers were:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/workers usage statistics for all of index scans : / .
+ qr/launched in total = 2, planned in total = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
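The three TAP tests above all assert the same invariant: once the leader is done (normally, after ERROR, or after FATAL), the free-worker counter is back at autovacuum_max_parallel_workers (10 in the test configuration). A minimal standalone model of that accounting might look as follows. It is only a sketch under stated assumptions: `ParallelWorkerPool`, `pool_reserve`, `pool_release` and `pool_release_all` are hypothetical names, the model covers a single leader, and it ignores the AutovacuumLock locking that the real functions perform.

```c
#include <assert.h>

/*
 * Hypothetical single-leader model of the counters used by the patch.
 * Not PostgreSQL code: no shared memory and no locking.
 */
typedef struct ParallelWorkerPool
{
	unsigned int free_workers;	/* models av_freeParallelWorkers */
	unsigned int reserved;		/* models av_nworkers_reserved */
} ParallelWorkerPool;

/* Reserve up to nrequested workers; fewer may be granted. */
unsigned int
pool_reserve(ParallelWorkerPool *pool, unsigned int nrequested)
{
	unsigned int granted;

	granted = (nrequested < pool->free_workers) ?
		nrequested : pool->free_workers;

	pool->free_workers -= granted;
	pool->reserved += granted;
	return granted;
}

/* Give back some of the workers this leader reserved earlier. */
void
pool_release(ParallelWorkerPool *pool, unsigned int nworkers)
{
	assert(nworkers <= pool->reserved);

	pool->free_workers += nworkers;
	pool->reserved -= nworkers;
}

/* Error-path cleanup: give back everything still reserved. */
void
pool_release_all(ParallelWorkerPool *pool)
{
	pool_release(pool, pool->reserved);
}
```

In this model, calling `pool_release_all` on every exit path plays the role that the PG_CATCH block and the before_shmem_exit callback play in the patch: whatever was reserved and not yet released is returned, so the free count ends where it started.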
[application/octet-stream] v14-0002-Logging-for-parallel-autovacuum.patch (7.7K, 4-v14-0002-Logging-for-parallel-autovacuum.patch)
From 45439d8d5e5da1ca9f10fddfd958943a2abae08c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:42:46 +0700
Subject: [PATCH v14 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 7a6d6f42634..59438c18b10 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. A non-NULL pointer
+ * signals that these statistics must be accumulated. For now, this
+ * is used only by the autovacuum leader worker, which logs them at
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1024,6 +1040,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2655,7 +2676,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3087,7 +3109,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total numbers of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c751c25a04d..fbff437d104 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
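The accounting introduced by the logging patch can be illustrated in isolation: every index scan adds its launched and planned worker counts to a PVWorkersUsage, and a NULL pointer means the caller did not ask for statistics. The sketch below is standalone and makes one assumption explicit: `record_workers_usage` is a hypothetical helper name used only for illustration, whereas the patch performs the same additions inline in parallel_vacuum_process_all_indexes().

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct the patch adds to commands/vacuum.h. */
typedef struct PVWorkersUsage
{
	int			nlaunched;
	int			nplanned;
} PVWorkersUsage;

/*
 * Hypothetical helper: accumulate the counts of one index scan.
 * A NULL wusage means that nobody wants the statistics.
 */
void
record_workers_usage(PVWorkersUsage *wusage, int nlaunched, int nplanned)
{
	if (wusage != NULL)
	{
		wusage->nlaunched += nlaunched;
		wusage->nplanned += nplanned;
	}
}
```

Because the counters are only ever added to, the values logged at the end of table processing are totals over all index scans (bulk deletion and cleanup alike), which is why the log line speaks of "launched in total" and "planned in total".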
[application/octet-stream] v14-0001-Parallel-autovacuum.patch (19.6K, 5-v14-0001-Parallel-autovacuum.patch)
From 74d2c076f5dda0bb135107b1e09511a04137c125 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 31 Oct 2025 14:42:27 +0700
Subject: [PATCH v14 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 163 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1c38488f2cb..2b6ceedf987 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Number of parallel autovacuum workers currently reserved by this process.
+ * It is only relevant for a parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must adjust the
+ * number of available parallel autovacuum workers in shmem
+ * accordingly.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1405,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * If parallel autovacuum leader is finishing due to FATAL error, make sure
+ * that all reserved workers are released.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1462,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2515,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2921,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3403,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that actually can be launched and reserves these workers
+ * (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nworkers;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers, that were reserved by
+ * current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3542,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3627,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c6484aea087..2a037485d5e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note,
+ * that parallel autovacuum leader is an exception, because only
+ * cost-based delays need to be affected also to parallel vacuum workers,
+ * and we will handle it elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..de9f1bd4808 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..559ef7b1771 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 51806597037..6170436b341 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..605d0829b03 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..f4b93b44531 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the parallel
+ * degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-11-22 20:13 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-11-22 20:13 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Sat, Nov 1, 2025 at 3:03 AM Masahiko Sawada <[email protected]> wrote:
>
> On Tue, Oct 28, 2025 at 6:10 AM Daniil Davydov <[email protected]> wrote:
> >
> > I'll allow it for a/v leader. I've also thought about "compute_parallel_delay".
> > The simplest solution that I see is to move cost-based delay parameters to
> > shared state (PVShared) and create some variables such as
> > VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
> > What do you think about this idea?
>
> I think that we need to somehow have parallel workers use the new
> vacuum delay parameters (e.g., VacuumCostPageHit and
> VacuumCostPageMiss) after the leader reloads the configuration file.
> The leader shares the initial parameters with the parallel workers
> (via DSM) before starting the workers but doesn't propagate the
> updates during the parallel operations. And the worker doesn't reload
> the configuration file.
I'm still working on it.
> Here are some review comments for 0001 patch:
>
> +static void
> +autovacuum_worker_before_shmem_exit(int code, Datum arg)
> +{
> + if (code != 0)
> + AutoVacuumReleaseAllParallelWorkers();
> +}
> +
>
> AutoVacuumReleaseAllParallelWorkers() calls
> AutoVacuumReleaseParallelWorkers() only when av_nworkers_reserved > 0,
> so I think we don't need the condition 'if (code != 0)' here.
Yeah, I wrote it more as a hint for the reader: "we should call this
function only if the process is exiting due to an error". But it is not
actually a necessary condition.
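To illustrate why the exit-code check is redundant, here is a minimal standalone sketch. It is not the patch code: the `demo_*` names are hypothetical stand-ins for av_nworkers_reserved, AutoVacuumReleaseParallelWorkers() and AutoVacuumReleaseAllParallelWorkers(), and locking is omitted. The point is only that the "release all" path does nothing when no workers are reserved, so the callback can call it unconditionally.

```c
#include <assert.h>

/* Hypothetical stand-ins for the per-leader and shared counters. */
static int demo_reserved = 0;       /* mirrors av_nworkers_reserved */
static int demo_free = 0;           /* mirrors av_freeParallelWorkers */
static int demo_release_calls = 0;  /* counts real release operations */

/* Give 'n' workers back to the pool (locking omitted in this sketch). */
static void
demo_release(int n)
{
    demo_free += n;
    demo_reserved -= n;
    demo_release_calls++;
}

/* Release everything this leader holds; a no-op if nothing is reserved. */
static void
demo_release_all(void)
{
    if (demo_reserved > 0)
        demo_release(demo_reserved);
}

/* Exit callback without the 'code != 0' check. */
static void
demo_before_exit(int code)
{
    (void) code;            /* exit code is intentionally ignored */
    demo_release_all();
}
```

On a clean exit nothing is reserved, so the callback touches no shared state; on an error exit it returns whatever is still held.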
>
> ---
> +extern void AutoVacuumReleaseAllParallelWorkers(void);
>
> There is no caller of this function outside of autovacuum.c.
>
I will fix it.
> ---
> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> short_desc => 'Maximum number of parallel autovacuum workers, that
> can be taken from bgworkers pool.',
> long_desc => 'This parameter is capped by "max_worker_processes"
> (not by "autovacuum_max_workers"!).',
> variable => 'autovacuum_max_parallel_workers',
> boot_val => '0',
> min => '0',
> max => 'MAX_BACKENDS',
> },
>
> Parallel vacuum in autovacuum can be used only when users set the
> autovacuum_parallel_workers storage parameter. How about using the
> default value 2 for autovacuum_max_parallel_workers GUC parameter?
>
Sounds reasonable, +1 for it.
On Fri, Nov 21, 2025 at 2:31 AM Sami Imseih <[email protected]> wrote:
>
> Hi,
>
> I started to review this patch set again, and it needed rebasing, so I
> went ahead and did that.
Thanks for the review and rebasing the patch!
>
> I also have some comments:
>
> #1
> In AutoVacuumReserveParallelWorkers()
> I think here we should assert:
>
> ```
> Assert(nworkers <= AutoVacuumShmem->av_freeParallelWorkers);
> ```
> prior to:
> ```
> + AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
> ```
>
> We are capping nworkers earlier in parallel_vacuum_compute_workers()
>
> ```
> /* Cap by GUC variable */
> parallel_workers = Min(parallel_workers, max_workers);
> ```
>
> so the assert will safeguard against someone making a faulty change
> in parallel_vacuum_compute_workers()
>
Hm, I guess it is just a bug. We should reduce av_freeParallelWorkers by the
computed 'nreserved' value (thus, we don't need any assertion). I'll fix it.
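A minimal sketch of the accounting described here, under stated assumptions: `SketchShmem`, `sketch_reserve` and `sketch_release` are hypothetical names, plain `int` replaces the patch's uint32 fields, and the LWLock is left out. It subtracts the granted count rather than the requested one, so the free counter cannot drop below zero.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared autovacuum state in the patch. */
typedef struct
{
    int free_parallel_workers;  /* mirrors av_freeParallelWorkers */
    int max_parallel_workers;   /* mirrors av_maxParallelWorkers */
} SketchShmem;

/* Per-leader count of reserved workers; mirrors av_nworkers_reserved. */
static int sketch_nreserved = 0;

static int
sketch_min(int a, int b)
{
    return a < b ? a : b;
}

/* Reserve up to 'nworkers'; returns how many were actually granted. */
static int
sketch_reserve(SketchShmem *shm, int nworkers)
{
    int granted;

    /* Reservation happens once, at the start of index processing. */
    assert(sketch_nreserved == 0);

    granted = sketch_min(shm->free_parallel_workers, nworkers);
    shm->free_parallel_workers -= granted;  /* granted, not requested */
    sketch_nreserved += granted;
    return granted;
}

/* Return 'nworkers' to the pool, capped by the current maximum. */
static void
sketch_release(SketchShmem *shm, int nworkers)
{
    shm->free_parallel_workers =
        sketch_min(shm->free_parallel_workers + nworkers,
                   shm->max_parallel_workers);
    sketch_nreserved -= nworkers;
}
```

With two free workers and a request for five, the caller is granted two and the pool drops to zero; releasing those two restores the pool to its maximum.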
>
> #2
> In
> parallel_vacuum_process_all_indexes()
>
> ```
> + /*
> + * Reserve workers in autovacuum global state. Note, that we
> may be given
> + * fewer workers than we requested.
> + */
> + if (AmAutoVacuumWorkerProcess() && nworkers > 0)
> + nworkers = AutoVacuumReserveParallelWorkers(nworkers);
> ```
>
> nworkers has a double meaning. The return value of
> AutoVacuumReserveParallelWorkers
> is nreserved. I think this should be
>
> ```
> nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> ```
>
> and nreserved becomes the authoritative value for the number of parallel
> workers after that point.
Reserving parallel workers is specific to autovacuum. If we added a
'nreserved' variable, we would have to change all the conditions below so as
not to break maintenance parallel vacuum. I think that would be confusing:
***
if (nworkers > 0 || (AmAutoVacuumWorkerProcess() && nreserved > 0))
***
Moreover, 'nworkers' reflects how many workers will be involved in vacuuming,
and I think that capping it by 'nreserved' does not break these semantics.
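The argument can be sketched in isolation. This is only an illustration with hypothetical names: `demo_plan_workers` is not a function in the patch, and `free_slots` stands in for whatever AutoVacuumReserveParallelWorkers() would grant. Because the capped value is written back to the same variable, any later check of the form `nworkers > 0` serves both the autovacuum path and the manual VACUUM path.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return the number of workers the caller should try to launch.
 * For autovacuum the request is capped by the free slots; for a manual
 * VACUUM the request passes through unchanged.
 */
static int
demo_plan_workers(int nworkers, bool is_autovacuum, int free_slots)
{
    if (is_autovacuum && nworkers > 0)
    {
        /* Stand-in for the reservation call in the patch. */
        if (nworkers > free_slots)
            nworkers = free_slots;
    }

    return nworkers;
}
```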
>
> #3
> I noticed in the logging:
>
> ```
> 2025-11-20 18:44:09.252 UTC [36787] LOG: automatic vacuum of table
> "test.public.t": index scans: 0
> workers usage statistics for all of index scans : launched in
> total = 3, planned in total = 3
> pages: 0 removed, 503306 remain, 14442 scanned (2.87% of
> total), 0 eagerly scanned
> tuples: 101622 removed, 7557074 remain, 0 are dead but not yet removable
> removable cutoff: 1711, which was 1 XIDs old when operation ended
> frozen: 4793 pages from table (0.95% of total) had 98303 tuples frozen
> visibility map: 4822 pages set all-visible, 4745 pages set
> all-frozen (0 were all-visible)
> index scan bypassed: 8884 pages from table (1.77% of total)
> have 195512 dead item identifiers
> ```
>
> that even though index scan was bypassed, we still launched parallel
> workers. I didn't dig deep into this,
> but that looks wrong. what do you think?
>
We can do both index vacuuming and index cleanup in parallel. I guess that
in your situation the vacuum was bypassed, but cleanup was called.
> #4
> instead of:
>
> "workers usage statistics for all of index scans : launched in total =
> 0, planned in total = 0"
>
> how about:
>
> "parallel index scan : workers planned = 0, workers launched = 0"
>
> also log this after the "index scan needed:" line; so it looks like
> this. What do you think?
>
> ```
> index scan needed: 13211 pages from table (2.63% of total) had
> 289482 dead item identifiers removed
> parallel index scan : workers planned = 0, workers launched = 0
> index "t_pkey": pages: 25234 in total, 0 newly deleted, 0 currently
> deleted, 0 reusable
> index "t_c1_idx": pages: 10219 in total, 0 newly deleted, 0
> currently deleted, 0 reusable
> ```
Agree, it looks better.
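For reference, a small standalone sketch of how the totals behind that line accumulate across index vacuum and cleanup passes. The struct mirrors PVWorkersUsage from the 0002 patch, but `DemoWorkersUsage`, `demo_accumulate` and `demo_format_usage` are hypothetical names, and snprintf stands in for appendStringInfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mirrors PVWorkersUsage: totals over all parallel index passes. */
typedef struct
{
    int nlaunched;
    int nplanned;
} DemoWorkersUsage;

/* Add one pass's numbers; a NULL pointer means "do not collect". */
static void
demo_accumulate(DemoWorkersUsage *usage, int planned, int launched)
{
    if (usage != NULL)
    {
        usage->nplanned += planned;
        usage->nlaunched += launched;
    }
}

/* Format the summary line; returns snprintf's result. */
static int
demo_format_usage(const DemoWorkersUsage *usage, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n",
                    usage->nplanned, usage->nlaunched);
}
```

Since the counters are summed over every pass, a table that goes through both index vacuuming and index cleanup reports the combined totals rather than per-pass values.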
Thanks, everybody, for the comments!
Please see the v15 patches.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v15-0004-Documentation-for-parallel-autovacuum.patch (4.4K, 2-v15-0004-Documentation-for-parallel-autovacuum.patch)
From a867a0ffb18549b493412d6bc079df6aef9b92a4 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v15 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..0f7096c2b5f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2841,6 +2841,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9264,6 +9265,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index f4f0433ef6f..02f306bbb8a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this table
+ in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 6557c5cffd8..e95a6488c5e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v15-0002-Logging-for-parallel-autovacuum.patch (7.7K, 3-v15-0002-Logging-for-parallel-autovacuum.patch)
From d10e3e0edd1f17ceabe8b12f780827ae0c9b686d Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v15 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65bb0568a86..ea7a18d4d51 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by autovacuum leader worker, because it must log it in
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1099,6 +1115,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2659,7 +2680,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3091,7 +3113,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 27a4d131897..a838b0885c6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
[text/x-patch] v15-0003-Tests-for-parallel-autovacuum.patch (19.2K, 4-v15-0003-Tests-for-parallel-autovacuum.patch)
From 267641b1832f011a32b8f870dd1794d0a82f0a7f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v15 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e6a4aa99eae..37c8d268903 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3436,6 +3436,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nreserved;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3466,6 +3473,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..1271768ebd2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup : workers planned = 2, / .
+ qr/workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader hits
+# an error. First, simulate the situation where the leader exits on an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState *inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("test_autovacuum",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type: %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
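For readers following the test module above, the invariant it checks (the free-worker counter returns to its maximum after a normal run, an ERROR, or a FATAL) can be modelled outside the server. What follows is only a minimal standalone sketch, not code from the patch set: `WorkerPool`, `pool_reserve`, `pool_release` and `pool_release_all` are hypothetical names, there is no locking, and the bounds check in `pool_release` is an added assumption that the patch itself does not make.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared autovacuum state; not the real struct. */
typedef struct WorkerPool
{
    uint32_t free_workers;  /* plays the role of av_freeParallelWorkers */
    uint32_t max_workers;   /* plays the role of av_maxParallelWorkers */
} WorkerPool;

/* Workers reserved by the single modelled leader (cf. av_nworkers_reserved). */
static int nworkers_reserved = 0;

static uint32_t
min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Reserve up to nworkers; the caller may get fewer than it asked for. */
int
pool_reserve(WorkerPool *pool, int nworkers)
{
    int nreserved;

    assert(nworkers >= 0);
    /* A leader reserves once per index-processing pass. */
    assert(nworkers_reserved == 0);

    nreserved = (int) min_u32(pool->free_workers, (uint32_t) nworkers);
    pool->free_workers -= (uint32_t) nreserved;
    nworkers_reserved += nreserved;
    return nreserved;
}

/* Give back nworkers, capping at max_workers in case the limit was lowered. */
void
pool_release(WorkerPool *pool, int nworkers)
{
    /* Assumption of this sketch: never release more than was reserved. */
    assert(nworkers >= 0 && nworkers <= nworkers_reserved);

    pool->free_workers = min_u32(pool->free_workers + (uint32_t) nworkers,
                                 pool->max_workers);
    nworkers_reserved -= nworkers;
}

/* Error-path helper: release whatever this leader still holds. */
void
pool_release_all(WorkerPool *pool)
{
    if (nworkers_reserved > 0)
        pool_release(pool, nworkers_reserved);
}
```

With a pool of 10, reserving 2 leaves 8 free; releasing 1 for a worker that failed to launch and then calling `pool_release_all` on the error path brings the counter back to 10, which is the value the TAP test compares against.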
[text/x-patch] v15-0001-Parallel-autovacuum.patch (19.7K, 5-v15-0001-Parallel-autovacuum.patch)
From 6c6806211a364519150138be6aff9f749e708252 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v15 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1c38488f2cb..e6a4aa99eae 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released if the parallel autovacuum
+ * leader is exiting due to a FATAL error. Otherwise this function has no effect.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1463,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2516,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2922,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3404,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that actually can be launched and reserves these workers
+ * (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3543,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3628,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c6484aea087..2a037485d5e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only
+ * cost-based delay settings need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..6c38275d30b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # disabled by default and limited by
+ # max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 51806597037..6170436b341 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..605d0829b03 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..23cb531c68c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-11-22 22:51 Sami Imseih <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Sami Imseih @ 2025-11-22 22:51 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
> > nworkers has a double meaning. The return value of
> > AutoVacuumReserveParallelWorkers
> > is nreserved. I think this should be
> >
> > ```
> > nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> > ```
> >
> > and nreserved becomes the authoritative value for the number of parallel
> > workers after that point.
I could not find this pattern being used in the code base.
I think it would be better, and more in line with what we generally do,
to pass by reference and update the value inside
AutoVacuumReserveParallelWorkers:
```
AutoVacuumReserveParallelWorkers(&nworkers).
```
Maybe that's just my preference.
>> ---
>> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
>> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
>> short_desc => 'Maximum number of parallel autovacuum workers, that
>> can be taken from bgworkers pool.',
>> long_desc => 'This parameter is capped by "max_worker_processes"
>> (not by "autovacuum_max_workers"!).',
>> variable => 'autovacuum_max_parallel_workers',
>> boot_val => '0',
>> min => '0',
>> max => 'MAX_BACKENDS',
>> },
>>
>> Parallel vacuum in autovacuum can be used only when users set the
>> autovacuum_parallel_workers storage parameter. How about using the
>> default value 2 for autovacuum_max_parallel_workers GUC parameter?
> Sounds reasonable, +1 for it.
v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
It should now be 2.
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. Is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
--
Sami
* Re: POC: Parallel processing of indexes in autovacuum
@ 2025-11-23 15:02 Daniil Davydov <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2025-11-23 15:02 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Sun, Nov 23, 2025 at 5:51 AM Sami Imseih <[email protected]> wrote:
>
> > > nworkers has a double meaning. The return value of
> > > AutoVacuumReserveParallelWorkers
> > > is nreserved. I think this should be
> > >
> > > ```
> > > nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> > > ```
> > >
> > > and nreserved becomes the authoritative value for the number of parallel
> > > workers after that point.
>
> I could not find this pattern being used in the code base.
> I think it would be better, and more in line with what we generally do,
> to pass by reference and update the value inside
> AutoVacuumReserveParallelWorkers:
>
> ```
> AutoVacuumReserveParallelWorkers(&nworkers).
> ```
Maybe I just don't like functions with side effects, but this function will
have them anyway. I'll switch to passing by reference as you
suggested.
>
> >> ---
> >> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
> >> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> >> short_desc => 'Maximum number of parallel autovacuum workers, that
> >> can be taken from bgworkers pool.',
> >> long_desc => 'This parameter is capped by "max_worker_processes"
> >> (not by "autovacuum_max_workers"!).',
> >> variable => 'autovacuum_max_parallel_workers',
> >> boot_val => '0',
> >> min => '0',
> >> max => 'MAX_BACKENDS',
> >> },
> >>
> >> Parallel vacuum in autovacuum can be used only when users set the
> >> autovacuum_parallel_workers storage parameter. How about using the
> >> default value 2 for autovacuum_max_parallel_workers GUC parameter?
>
> > Sounds reasonable, +1 for it.
>
> v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
> It should now be 2.
>
> + Sets the maximum number of parallel autovacuum workers that
> + can be used for parallel index vacuuming at one time. Is capped by
> + <xref linkend="guc-max-worker-processes"/>. The default is 0,
> + which means no parallel index vacuuming.
Thanks for noticing it! Fixed.
I am sending an updated set of patches.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v16-0003-Tests-for-parallel-autovacuum.patch (19.2K, 2-v16-0003-Tests-for-parallel-autovacuum.patch)
From 9ecc7800596c79b5f1234e2b8453aac42321c1fd Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v16 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ec7b7170be5..bf22fe2d00c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ceca03bcf34..ce15985cf5d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3437,6 +3437,13 @@ AutoVacuumReserveParallelWorkers(int *nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += *nworkers;
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
@@ -3466,6 +3473,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..1271768ebd2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup : workers planned = 2, / .
+ qr/workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets an
+# error. First, simulate a situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate a situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
[text/x-patch] v16-0004-Documentation-for-parallel-autovacuum.patch (4.4K, 3-v16-0004-Documentation-for-parallel-autovacuum.patch)
From 5a7bda14d60af3dcd7072dd34dba637e19de7aba Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v16 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..85db09df897 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2841,6 +2841,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9264,6 +9265,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index f4f0433ef6f..02f306bbb8a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 6557c5cffd8..e95a6488c5e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v16-0001-Parallel-autovacuum.patch (19.8K, 4-v16-0001-Parallel-autovacuum.patch)
From 304f92f13dcbf90ddbfbdd95859c772fdfaadbb3 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v16 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..6e2c22be2ee 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note, that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1c38488f2cb..ceca03bcf34 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, if parallel autovacuum
+ * leader is exiting due to a FATAL error. Otherwise the function has no effect.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1463,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2516,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2922,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3404,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function during computing the parallel
+ * degree.
+ *
+ * 'nworkers' is the desired number of parallel workers to reserve. Function
+ * sets 'nworkers' to the number of parallel workers that actually can be
+ * launched and reserves these workers (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers= Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers, that were reserved by
+ * current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3543,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3628,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c6484aea087..2a037485d5e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only cost-based
+ * delay settings need to be propagated to parallel vacuum workers, and
+ * that is handled elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..6c38275d30b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # max parallel autovacuum workers, limited by
+ # max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 51806597037..6170436b341 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..605d0829b03 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..9d558c9c056 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the parallel
+ * degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
[text/x-patch] v16-0002-Logging-for-parallel-autovacuum.patch (7.7K, 5-v16-0002-Logging-for-parallel-autovacuum.patch)
From 5cacc91030063a1514e74e9af864868014df1658 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v16 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65bb0568a86..ea7a18d4d51 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by autovacuum leader worker, because it must log it in
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1099,6 +1115,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2659,7 +2680,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3091,7 +3113,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 6e2c22be2ee..ec7b7170be5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 27a4d131897..a838b0885c6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-05 18:51 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-01-05 18:51 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sun, Nov 23, 2025 at 7:02 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Sun, Nov 23, 2025 at 5:51 AM Sami Imseih <[email protected]> wrote:
> >
> > > > nworkers has a double meaning. The return value of
> > > > AutoVacuumReserveParallelWorkers
> > > > is nreserved. I think this should be
> > > >
> > > > ```
> > > > nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> > > > ```
> > > >
> > > > and nreserved becomes the authoritative value for the number of parallel
> > > > workers after that point.
> >
> > I could not find this pattern being used in the code base.
> > I think it will be better and more in-line without what we generally do
> > and pass-by-reference and update the value inside
> > AutoVacuumReserveParallelWorkers:
> >
> > ```
> > AutoVacuumReserveParallelWorkers(&nworkers).
> > ```
>
> Maybe I just don't like functions with side effects, but this function will
> have ones anyway. I'll add logic with passing by reference as you
> suggested.
>
> >
> > >> ---
> > >> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
> > >> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> > >> short_desc => 'Maximum number of parallel autovacuum workers, that
> > >> can be taken from bgworkers pool.',
> > >> long_desc => 'This parameter is capped by "max_worker_processes"
> > >> (not by "autovacuum_max_workers"!).',
> > >> variable => 'autovacuum_max_parallel_workers',
> > >> boot_val => '0',
> > >> min => '0',
> > >> max => 'MAX_BACKENDS',
> > >> },
> > >>
> > >> Parallel vacuum in autovacuum can be used only when users set the
> > >> autovacuum_parallel_workers storage parameter. How about using the
> > >> default value 2 for autovacuum_max_parallel_workers GUC parameter?
> >
> > > Sounds reasonable, +1 for it.
> >
> > v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
> > It should now be 2.
> >
> > + Sets the maximum number of parallel autovacuum workers that
> > + can be used for parallel index vacuuming at one time. Is capped by
> > + <xref linkend="guc-max-worker-processes"/>. The default is 0,
> > + which means no parallel index vacuuming.
>
> Thanks for noticing it! Fixed.
>
> I am sending an updated set of patches.
Thank you for updating the patch! I've reviewed the 0001 patch and
here are some comments:
---
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += *nworkers;
I think we can simply assign *nworkers to av_nworkers_reserved instead
of incrementing it as we're sure that av_nworkers_reserved is 0 at the
beginning of this function.
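To illustrate, here is a minimal standalone sketch of the reservation
step with a plain assignment. This is not patch code: the names
free_parallel_workers, nworkers_reserved and reserve_parallel_workers()
are hypothetical stand-ins for the shmem counter, the leader-local
counter and AutoVacuumReserveParallelWorkers(), and locking is omitted.

```c
#include <assert.h>

/* Hypothetical stand-ins for the shared and leader-local counters. */
static unsigned int free_parallel_workers = 2;
static int	nworkers_reserved = 0;

/* Simplified model of the reservation step; no locking is modeled. */
static void
reserve_parallel_workers(int *nworkers)
{
	/* Reservation only happens when nothing is currently reserved. */
	assert(nworkers_reserved == 0);
	assert(*nworkers >= 0);

	/* Provide as many workers as are free. */
	if ((unsigned int) *nworkers > free_parallel_workers)
		*nworkers = (int) free_parallel_workers;
	free_parallel_workers -= (unsigned int) *nworkers;

	/* Assignment is sufficient because of the first assertion above. */
	nworkers_reserved = *nworkers;
}
```

With the assertion in place, assignment and increment behave the same,
so the assignment mainly makes the invariant easier to read.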
---
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
We can put an assertion at the end of the function to verify that this
worker no longer holds any reserved workers.
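A rough standalone sketch of what I mean, assuming simplified counters
(sketch_free, sketch_max and sketch_reserved are hypothetical names, not
the patch's variables, and locking is left out):

```c
#include <assert.h>

static unsigned int sketch_free = 0;	/* hypothetical shared free counter */
static unsigned int sketch_max = 2;	/* hypothetical configured maximum */
static int	sketch_reserved = 2;	/* hypothetical leader-local counter */

/* Simplified model of releasing 'nworkers' reserved workers. */
static void
sketch_release(int nworkers)
{
	unsigned int total = sketch_free + (unsigned int) nworkers;

	/* Never exceed the configured maximum. */
	sketch_free = (total < sketch_max) ? total : sketch_max;
	sketch_reserved -= nworkers;
}

/* Release everything, then check the post-condition. */
static void
sketch_release_all(void)
{
	if (sketch_reserved > 0)
		sketch_release(sketch_reserved);

	/* The suggested assertion: nothing may remain reserved. */
	assert(sketch_reserved == 0);
}
```

In the real function this would presumably be an Assert() on
av_nworkers_reserved.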
---
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
I think it would be more future-proof if we call
AutoVacuumReleaseAllParallelWorkers() regardless of the code if there
is no strong reason why we check the code there.
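As a minimal standalone sketch (demo_reserved, demo_free and the demo_*
functions are hypothetical names used only for illustration; the
callback signature is simplified), the callback would ignore the exit
code entirely:

```c
#include <assert.h>
#include <stddef.h>

static int	demo_reserved = 0;	/* hypothetical leader-local counter */
static int	demo_free = 0;		/* hypothetical shared free counter */

/* Simplified model of releasing all reserved workers. */
static void
demo_release_all(void)
{
	demo_free += demo_reserved;
	demo_reserved = 0;
}

/* Exit callback that releases workers for any exit code. */
static void
demo_worker_onexit(int code, void *arg)
{
	(void) code;				/* intentionally unused */
	(void) arg;
	demo_release_all();
}
```

Since releasing is a no-op when nothing is reserved, calling it on a
clean exit costs nothing in this model.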
---
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
*/
InitProcess();
I think it's better to register
autovacuum_worker_before_shmem_exit() after the process
initialization, since the function could use LWLocks to release the
reserved workers. AutoVacuumReleaseAllParallelWorkers() doesn't try to
release anything when av_nworkers_reserved == 0, but it would still be
more future-proof to register the callback after the basic process
initialization.
How about renaming autovacuum_worker_before_shmem_exit() to
autovacuum_worker_onexit()?
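The ordering I have in mind can be sketched with a toy model. Everything
here (model_init_process(), model_register(), model_onexit() and so on)
is a hypothetical stand-in, not the real InitProcess() or
before_shmem_exit() API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool model_proc_initialized = false;
static void (*model_exit_cb) (int) = NULL;
static int	model_cb_runs = 0;

/* Stand-in for basic process initialization. */
static void
model_init_process(void)
{
	model_proc_initialized = true;
}

/* Exit callback that relies on the process being initialized. */
static void
model_onexit(int code)
{
	(void) code;
	assert(model_proc_initialized);
	model_cb_runs++;
}

/* Stand-in for registering an exit callback. */
static void
model_register(void (*cb) (int))
{
	model_exit_cb = cb;
}

/* Initialize first, register the callback afterwards. */
static void
model_worker_main(void)
{
	model_init_process();
	model_register(model_onexit);
}

/* Stand-in for process exit: run the callback if one is registered. */
static void
model_exit(int code)
{
	if (model_exit_cb != NULL)
		model_exit_cb(code);
}
```

With this order, an exit that happens before initialization finds no
callback registered, so the callback never runs against an
uninitialized process.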
---
IIUC the patch needs to implement some logic to propagate the updates
of vacuum delay parameters to parallel vacuum workers. Are you still
working on it? Or shall I draft this part on top of the 0001 patch?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-05 20:44 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-01-05 20:44 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Jan 6, 2026 at 1:51 AM Masahiko Sawada <[email protected]> wrote:
>
> On Sun, Nov 23, 2025 at 7:02 AM Daniil Davydov <[email protected]> wrote:
> >
> > Hi,
> >
> > On Sun, Nov 23, 2025 at 5:51 AM Sami Imseih <[email protected]> wrote:
> > >
> > > > > nworkers has a double meaning. The return value of
> > > > > AutoVacuumReserveParallelWorkers
> > > > > is nreserved. I think this should be
> > > > >
> > > > > ```
> > > > > nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> > > > > ```
> > > > >
> > > > > and nreserved becomes the authoritative value for the number of parallel
> > > > > workers after that point.
> > >
> > > I could not find this pattern being used in the code base.
> > > I think it will be better and more in-line without what we generally do
> > > and pass-by-reference and update the value inside
> > > AutoVacuumReserveParallelWorkers:
> > >
> > > ```
> > > AutoVacuumReserveParallelWorkers(&nworkers).
> > > ```
> >
> > Maybe I just don't like functions with side effects, but this function will
> > have ones anyway. I'll add logic with passing by reference as you
> > suggested.
> >
> > >
> > > >> ---
> > > >> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
> > > >> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> > > >> short_desc => 'Maximum number of parallel autovacuum workers, that
> > > >> can be taken from bgworkers pool.',
> > > >> long_desc => 'This parameter is capped by "max_worker_processes"
> > > >> (not by "autovacuum_max_workers"!).',
> > > >> variable => 'autovacuum_max_parallel_workers',
> > > >> boot_val => '0',
> > > >> min => '0',
> > > >> max => 'MAX_BACKENDS',
> > > >> },
> > > >>
> > > >> Parallel vacuum in autovacuum can be used only when users set the
> > > >> autovacuum_parallel_workers storage parameter. How about using the
> > > >> default value 2 for autovacuum_max_parallel_workers GUC parameter?
> > >
> > > > Sounds reasonable, +1 for it.
> > >
> > > v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
> > > It should now be 2.
> > >
> > > + Sets the maximum number of parallel autovacuum workers that
> > > + can be used for parallel index vacuuming at one time. Is capped by
> > > + <xref linkend="guc-max-worker-processes"/>. The default is 0,
> > > + which means no parallel index vacuuming.
> >
> > Thanks for noticing it! Fixed.
> >
> > I am sending an updated set of patches.
>
> Thank you for updating the patch! I've reviewed the 0001 patch and
> here are some comments:
Thank you for the review!
>
> ---
> + /* Remember how many workers we have reserved. */
> + av_nworkers_reserved += *nworkers;
>
> I think we can simply assign *nworkers to av_nworkers_reserved instead
> of incrementing it as we're sure that av_nworkers_reserved is 0 at the
> beginning of this function.
Agreed, that will be clearer.
>
> ---
> +static void
> +AutoVacuumReleaseAllParallelWorkers(void)
> +{
> + /* Only leader worker can call this function. */
> + Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
> +
> + if (av_nworkers_reserved > 0)
> + AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
> +}
>
> We can put an assertion at the end of the function to verify that this
> worker doesn't reserve any worker.
It's not a problem to add this assertion, but I have doubts: the
function already promises to release the given number of workers, yet we
would still be checking whether that number has actually been released.
I suggest another place for the assertion; see my comment below.
>
> ---
> +static void
> +autovacuum_worker_before_shmem_exit(int code, Datum arg)
> +{
> + if (code != 0)
> + AutoVacuumReleaseAllParallelWorkers();
> +}
>
> I think it would be more future-proof if we call
> AutoVacuumReleaseAllParallelWorkers() regardless of the code if there
> is no strong reason why we check the code there.
I think we can leave "code != 0" so as not to confuse readers, but
add an assertion at the end of the function that all workers have been
released. That way we state that 1) during normal processing we must not
have any reserved workers and 2) even after a FATAL error we are sure
that we don't have reserved workers.
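To make the intent concrete, here is a minimal standalone sketch of that
scheme. It is not code from the patch: the counters and function names
below are simplified, hypothetical stand-ins for av_freeParallelWorkers,
av_nworkers_reserved and the reserve/release functions, and there is no
locking or real shared memory.

```c
#include <assert.h>

/* Hypothetical stand-in for AutoVacuumShmem->av_freeParallelWorkers. */
static int free_parallel_workers = 10;

/* Hypothetical stand-in for the backend-local av_nworkers_reserved. */
static int av_nworkers_reserved = 0;

/*
 * Reserve up to *nworkers workers. The value is passed by reference and
 * overwritten with the number that was actually reserved.
 */
void
reserve_workers(int *nworkers)
{
	/* Nothing may be reserved on entry, so plain assignment is enough. */
	assert(av_nworkers_reserved == 0);

	if (*nworkers > free_parallel_workers)
		*nworkers = free_parallel_workers;

	free_parallel_workers -= *nworkers;
	av_nworkers_reserved = *nworkers;
}

/* Give back the specified number of previously reserved workers. */
void
release_workers(int nworkers)
{
	assert(nworkers <= av_nworkers_reserved);

	free_parallel_workers += nworkers;
	av_nworkers_reserved -= nworkers;
}

/* Give back everything this process still holds, if anything. */
void
release_all_workers(void)
{
	if (av_nworkers_reserved > 0)
		release_workers(av_nworkers_reserved);
}

/*
 * Exit callback: release only on abnormal exit (code != 0), but assert
 * in every case that nothing is left reserved afterwards.
 */
void
worker_before_shmem_exit(int code)
{
	if (code != 0)
		release_all_workers();

	assert(av_nworkers_reserved == 0);
}
```

With this shape, a normal exit that forgot to release its workers trips
the assertion instead of silently leaking them.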
>
> ---
> + before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
>
> /*
> * Create a per-backend PGPROC struct in shared memory. We must do this
> * before we can use LWLocks or access any shared memory.
> */
> InitProcess();
>
> I think it's better to register the
> autovacuum_worker_before_shmem_exit() after the process
> initialization. The function could use LWLocks to release the reserved
> workers. Admittedly, AutoVacuumReleaseAllParallelWorkers() doesn't try
> to release the reserved workers when av_nworkers_reserved == 0, but it
> would be more future-proof to do that after the basic process
> initialization.
My bad, I missed the comment above InitProcess. I agree with you.
To be safe, the callback will be registered after BaseInit.
>
> How about renaming autovacuum_worker_before_shmem_exit() to
> autovacuum_worker_onexit()?
We also have "on_shmem_exit" callbacks, so the "onexit" name might
confuse somebody.
Since the current function name doesn't exceed the line length limit
anywhere, I suggest keeping it.
> ---
> IIUC the patch needs to implement some logic to propagate the updates
> of vacuum delay parameters to parallel vacuum workers.
Yep.
> Are you still working on it? Or shall I draft this part on top of the
> 0001 patch?
I was hoping to find a more "beautiful" approach, but for now I have
only one idea: make parallel a/v workers read the values of these
parameters from shmem (where the leader a/v process can update them).
I'll post this patch in the near future.
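Purely as an illustration of the shmem idea (this is not from any posted
patch, and every name below is hypothetical), the leader could publish
its delay settings together with a generation counter, and each parallel
worker could refresh its cached copy whenever the counter changes. A real
implementation would need a lock or atomics around the shared fields;
this sketch leaves that out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared area that the leader fills in. */
typedef struct SharedDelayParams
{
	unsigned	generation;		/* bumped by the leader on every update */
	double		cost_delay_ms;	/* models vacuum_cost_delay */
	int			cost_limit;		/* models vacuum_cost_limit */
} SharedDelayParams;

/* Hypothetical per-worker cached copy of the shared values. */
typedef struct WorkerDelayParams
{
	unsigned	seen_generation;
	double		cost_delay_ms;
	int			cost_limit;
} WorkerDelayParams;

/* Leader side: publish new values, e.g. after a config reload. */
void
leader_publish_delay_params(SharedDelayParams *shared,
							double cost_delay_ms, int cost_limit)
{
	assert(shared != NULL);

	shared->cost_delay_ms = cost_delay_ms;
	shared->cost_limit = cost_limit;
	shared->generation++;
}

/*
 * Worker side: refresh the cached copy if the leader published something
 * new. Returns true when the cached values were updated.
 */
bool
worker_refresh_delay_params(const SharedDelayParams *shared,
							WorkerDelayParams *local)
{
	assert(shared != NULL && local != NULL);

	if (local->seen_generation == shared->generation)
		return false;

	local->cost_delay_ms = shared->cost_delay_ms;
	local->cost_limit = shared->cost_limit;
	local->seen_generation = shared->generation;
	return true;
}
```

A worker would call the refresh function at the points where it already
checks for vacuum delay, so the extra cost in the common case is one
counter comparison.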
Please see the v17 patches (only 0001 has been changed).
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v17-0002-Logging-for-parallel-autovacuum.patch (7.7K, 2-v17-0002-Logging-for-parallel-autovacuum.patch)
From 0aafa271ec90dbe494eea79fd484a4856023b3a8 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v17 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2086a577199..35d2b07aa8a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -349,6 +349,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -711,6 +717,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc_array(char *, vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by the autovacuum leader worker, because it must log them
+ * at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1125,6 +1141,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2700,7 +2721,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3133,7 +3155,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 6a3a00585f9..490f93959d1 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..ec5d70aacdc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b9e671fcda8..6e35c6aa493 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2397,6 +2397,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
[text/x-patch] v17-0004-Documentation-for-parallel-autovacuum.patch (4.4K, 3-v17-0004-Documentation-for-parallel-autovacuum.patch)
From 6f615b06b1578b5c72b36074de20811999b52e4f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v17 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 601aa3afb8e..36fcc72f325 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2847,6 +2847,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9282,6 +9283,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. Is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter set,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 77c5a763d45..3592c9acff9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v17-0003-Tests-for-parallel-autovacuum.patch (19.2K, 4-v17-0003-Tests-for-parallel-autovacuum.patch)
From ca52efb09b02d21f6e35dabed2b9563d851151a0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v17 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 490f93959d1..c2f0a37eef2 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index bc11970bfee..a27274bfb4d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3439,6 +3439,13 @@ AutoVacuumReserveParallelWorkers(int *nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved = *nworkers;
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
@@ -3468,6 +3475,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4c6d56d97d8..bfe365fa575 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1b31c5b98d6..01a3e3ec044 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..1271768ebd2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres, so it can launch parallel autovacuum workers, log all
+# information we are interested in and autovacuum works frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup : workers planned = 2, / .
+ qr/workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if leader gets an
+# error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFaulureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFaulureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFaulureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type: %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
[text/x-patch] v17-0001-Parallel-autovacuum.patch (19.7K, 5-v17-0001-Parallel-autovacuum.patch)
From a5f261dc7b4fe37aba8f24ef5241e2b1f2d85a36 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v17 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 166 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 242 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0b83f98ed5f..692ac46733e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..6a3a00585f9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note, that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3e507d23cc9..bc11970bfee 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,19 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1438,6 +1474,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
/* Early initialization */
BaseInit();
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* If an exception is encountered, processing resumes here.
*
@@ -2480,6 +2518,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2924,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3406,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
+ * autovacuum process must call this function during computing the parallel
+ * degree.
+ *
+ * 'nworkers' is the desired number of parallel workers to reserve. Function
+ * sets 'nworkers' to the number of parallel workers that actually can be
+ * launched and reserves these workers (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers= Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers, that were reserved by
+ * current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3545,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3630,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae9d5f3fb70..c8a99a67767 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only cost-based
+ * delay settings need to be propagated to parallel vacuum workers, and we
+ * handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 7c60b125564..e933f5048f7 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # disabled by default and limited by
+ # max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 06edea98f06..2b8a4aab390 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index db559b39c4d..ad6e19f426c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e43067d0260..4acadbc0610 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d03ab247788..c1d882659f9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will be computed based on number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-07 09:51 Daniil Davydov <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Daniil Davydov @ 2026-01-07 09:51 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Jan 6, 2026 at 3:44 AM Daniil Davydov <[email protected]> wrote:
>
> On Tue, Jan 6, 2026 at 1:51 AM Masahiko Sawada <[email protected]> wrote:
> >
> > Are you still working on it? Or shall I draft this part on top of the
> > 0001 patch?
>
> I thought about some "beautiful" approach, but for now I have
> only one idea - force parallel a/v workers to get values for these
> parameters from shmem (which obviously can be modified by the
> leader a/v process). I'll post this patch in the near future.
>
I am posting a draft version of the patch (see 0005) that allows the
parallel leader to propagate changes to cost-based parameters to its
parallel workers. It is a very rough fix, but it reflects my idea, which is
to have some shared state from which parallel workers can get values for the
parameters (and which only the leader worker can modify, obviously).
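To make the idea easier to discuss, here is a minimal standalone sketch of
the general pattern. It is not the code from 0005. The names
SharedCostParams, WorkerCostState, leader_publish_cost_params and
worker_refresh_cost_params are hypothetical, and locking is deliberately
left out: a real implementation would keep the struct in the parallel
context's shared memory and protect the fields with a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical shared state. Only the leader writes it; parallel workers
 * only read it. Synchronization is omitted in this single-process sketch.
 */
typedef struct SharedCostParams
{
	uint32_t	generation;		/* bumped by the leader on every change */
	double		cost_delay;		/* stands in for vacuum_cost_delay (ms) */
	int			cost_limit;		/* stands in for vacuum_cost_limit */
} SharedCostParams;

/* Per-worker private copy of the parameters. */
typedef struct WorkerCostState
{
	uint32_t	seen_generation;	/* last generation this worker applied */
	double		cost_delay;
	int			cost_limit;
} WorkerCostState;

/* Leader side: publish new values and advance the generation counter. */
static void
leader_publish_cost_params(SharedCostParams *shared, double delay, int limit)
{
	assert(delay >= 0.0 && limit > 0);

	shared->cost_delay = delay;
	shared->cost_limit = limit;
	shared->generation++;
}

/*
 * Worker side: called at a convenient point (for example, before a delay).
 * Copies the shared values only if the leader changed them since the last
 * call. Returns true if the local copy was updated.
 */
static bool
worker_refresh_cost_params(const SharedCostParams *shared,
						   WorkerCostState *local)
{
	if (local->seen_generation == shared->generation)
		return false;

	local->cost_delay = shared->cost_delay;
	local->cost_limit = shared->cost_limit;
	local->seen_generation = shared->generation;
	return true;
}
```

The generation counter keeps the common case cheap: a worker compares one
integer and copies the values only after the leader has published a change.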
I have also added a test showing that this idea works: it ensures that
parallel workers pick up new parameter values once they have been changed
for the leader worker.
Note that this is still a work in progress: the logic currently works only
for the vacuum_cost_delay and vacuum_cost_limit parameters. I think that we
should agree on the idea first, and only then apply the logic to all
appropriate parameters.
What do you think?
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v18-0001-Parallel-autovacuum.patch (19.7K, 2-v18-0001-Parallel-autovacuum.patch)
From a5f261dc7b4fe37aba8f24ef5241e2b1f2d85a36 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v18 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 166 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 242 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0b83f98ed5f..692ac46733e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..6a3a00585f9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note, that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3e507d23cc9..bc11970bfee 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Number of parallel autovacuum workers currently reserved by this process.
+ * It is only relevant for a parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,19 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if the parallel
+ * autovacuum leader is exiting due to a FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1438,6 +1474,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
/* Early initialization */
BaseInit();
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* If an exception is encountered, processing resumes here.
*
@@ -2480,6 +2518,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2924,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3406,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to honor the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function while computing the parallel
+ * degree.
+ *
+ * 'nworkers' is the desired number of parallel workers to reserve. This
+ * function sets 'nworkers' to the number of parallel workers that can
+ * actually be launched and reserves them (if any) in global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller then occupies all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3545,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3630,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by the new parameter value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user increased the number of parallel autovacuum workers, we
+ * must increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae9d5f3fb70..c8a99a67767 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only
+ * cost-based delay settings need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 7c60b125564..e933f5048f7 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Sets the maximum number of parallel autovacuum workers that can be taken from the background worker pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # max parallel autovacuum workers, limited by
+ # max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 06edea98f06..2b8a4aab390 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index db559b39c4d..ad6e19f426c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e43067d0260..4acadbc0610 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d03ab247788..c1d882659f9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
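
The reserve/release accounting in the patch above can be sketched outside the server as a plain counter. This is only an illustration under stated assumptions: `WorkerPool`, `pool_reserve`, `pool_release`, and `pool_set_max` are hypothetical names, not part of the patch; locking via AutovacuumLock is omitted; and the previous maximum is read from the pool itself rather than passed in as the old GUC value.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared-memory counters in the patch. */
typedef struct WorkerPool
{
    int free_workers;   /* plays the role of av_freeParallelWorkers */
    int max_workers;    /* plays the role of av_maxParallelWorkers */
} WorkerPool;

static int
min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Reserve up to 'wanted' workers; returns how many were actually granted. */
static int
pool_reserve(WorkerPool *pool, int wanted)
{
    int granted;

    assert(wanted >= 0);
    granted = min_int(pool->free_workers, wanted);
    pool->free_workers -= granted;
    return granted;
}

/* Give workers back, never exceeding the (possibly lowered) maximum. */
static void
pool_release(WorkerPool *pool, int nworkers)
{
    assert(nworkers >= 0);
    pool->free_workers = min_int(pool->free_workers + nworkers,
                                 pool->max_workers);
}

/* Apply a new maximum, roughly as the launcher does after a config reload. */
static void
pool_set_max(WorkerPool *pool, int new_max)
{
    int prev_max = pool->max_workers;

    /* Cap first, so a lowered limit takes effect immediately. */
    pool->free_workers = min_int(pool->free_workers, new_max);

    /* A raised limit makes the difference available right away. */
    if (new_max > prev_max)
        pool->free_workers += new_max - prev_max;

    pool->max_workers = new_max;
}
```

The cap in `pool_release` is what keeps the counter consistent when the limit is lowered while workers are still reserved: the workers handed back later cannot push the free count above the new maximum.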
[text/x-patch] v18-0004-Documentation-for-parallel-autovacuum.patch (4.4K, 3-v18-0004-Documentation-for-parallel-autovacuum.patch)
From bbcc4b92941325248254b074a2d1c94f244b6a6c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v18 4/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 601aa3afb8e..36fcc72f325 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2847,6 +2847,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9282,6 +9283,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process encounters a table whose
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter is
+ set to a non-negative value, it launches parallel workers to vacuum the
+ table's indexes in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 77c5a763d45..3592c9acff9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
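
To make the documented semantics of autovacuum_parallel_workers concrete (-1 disables, 0 derives the degree from the indexes, a positive value acts as an upper bound), here is a minimal sketch. It assumes the number of parallel-capable indexes is already known and passed in as `nindexes_parallel`; how the server derives that number is not modeled. The function name `planned_parallel_workers` and its parameters are hypothetical.

```c
#include <assert.h>

static int
min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * reloption          value of autovacuum_parallel_workers (-1, 0, or > 0)
 * nindexes_parallel  indexes eligible for parallel processing (assumed input)
 * max_workers        value of autovacuum_max_parallel_workers
 * free_workers       workers not currently reserved by other leaders
 */
static int
planned_parallel_workers(int reloption, int nindexes_parallel,
                         int max_workers, int free_workers)
{
    int nworkers;

    /* -1 means no parallel index vacuuming; 0 workers allowed means none. */
    if (reloption < 0 || max_workers <= 0 || nindexes_parallel <= 0)
        return 0;

    /* 0 means "derive from the indexes"; a positive value is a cap. */
    nworkers = (reloption > 0)
        ? min_int(reloption, nindexes_parallel)
        : nindexes_parallel;

    /* Cap by the GUC, then by what is actually free right now. */
    nworkers = min_int(nworkers, max_workers);
    return min_int(nworkers, free_workers);
}
```

With the default limit of 2, a table with four eligible indexes and the storage parameter set to 0 would plan at most two workers, and fewer if other leaders have already reserved some.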
[text/x-patch] v18-0002-Logging-for-parallel-autovacuum.patch (7.7K, 4-v18-0002-Logging-for-parallel-autovacuum.patch)
From 0aafa271ec90dbe494eea79fd484a4856023b3a8 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v18 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2086a577199..35d2b07aa8a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -349,6 +349,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL if these statistics are not collected
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -711,6 +717,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc_array(char *, vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for worker usage statistics. A non-NULL pointer makes
+ * it explicit that these statistics must be accumulated. For now, this
+ * is used only by the autovacuum leader worker, because it must log them
+ * at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1125,6 +1141,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2700,7 +2721,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3133,7 +3155,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 6a3a00585f9..490f93959d1 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..ec5d70aacdc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b9e671fcda8..6e35c6aa493 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2397,6 +2397,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
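
The logging patch above threads an optional counter struct through each index pass and reports the totals once per table. A minimal standalone sketch of that pattern follows; `WorkersUsage`, `usage_accumulate`, and `usage_format` are hypothetical names for illustration, and the message text is copied from the patch's format string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mirror of the PVWorkersUsage struct from the patch. */
typedef struct WorkersUsage
{
    int nlaunched;
    int nplanned;
} WorkersUsage;

/* Add one index pass to the totals; a NULL pointer means "do not track". */
static void
usage_accumulate(WorkersUsage *wusage, int nplanned, int nlaunched)
{
    if (wusage == NULL)
        return;

    wusage->nplanned += nplanned;
    wusage->nlaunched += nlaunched;
}

/*
 * Write the summary line into 'buf'. Returns the number of characters
 * snprintf wanted to write, or 0 (with an empty string) when not tracking.
 */
static int
usage_format(const WorkersUsage *wusage, char *buf, size_t buflen)
{
    if (wusage == NULL)
    {
        if (buflen > 0)
            buf[0] = '\0';
        return 0;
    }

    return snprintf(buf, buflen,
                    "parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n",
                    wusage->nplanned, wusage->nlaunched);
}
```

Because the totals are summed over every bulk-delete and cleanup pass, a gap between planned and launched indicates that workers were unavailable during at least one pass, not necessarily during all of them.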
[text/x-patch] v18-0003-Tests-for-parallel-autovacuum.patch (19.3K, 5-v18-0003-Tests-for-parallel-autovacuum.patch)
From 29fb650ac54e2f3bbc8f920292662906345e29ac Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v18 3/5] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 170 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 550 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 490f93959d1..c2f0a37eef2 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to test that all reserved parallel workers are released
+ * even when the leader fails, allow injection points to trigger a failure
+ * at this point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index bc11970bfee..a27274bfb4d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3439,6 +3439,13 @@ AutoVacuumReserveParallelWorkers(int *nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved = *nworkers;
+ /*
+ * Injection point to help test the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
@@ -3468,6 +3475,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help test the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4c6d56d97d8..bfe365fa575 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1b31c5b98d6..01a3e3ec044 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..8bf153d132c
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,170 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres, so it can launch parallel autovacuum workers, log all
+# information we are interested in and autovacuum works frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup : workers planned = 2, / .
+ qr/workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the leader exiting due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the leader exiting due to a FATAL error.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFaulureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFaulureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFaulureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
[text/x-patch] v18-0005-Cost-based-parameters-propagation-for-parallel-a.patch (16.6K, 6-v18-0005-Cost-based-parameters-propagation-for-parallel-a.patch)
From 14abdef918a73e465900f758204de19982fc4224 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 7 Jan 2026 16:03:20 +0700
Subject: [PATCH v18 5/5] Cost-based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 26 +++-
src/backend/commands/vacuumparallel.c | 130 ++++++++++++++++++
src/include/commands/vacuum.h | 2 +
src/test/modules/test_autovacuum/Makefile | 2 +
.../modules/test_autovacuum/t/001_basic.pl | 83 +++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 ++
.../modules/test_autovacuum/test_autovacuum.c | 75 ++++++++++
7 files changed, 328 insertions(+), 2 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index aa4fbec143f..4c40a36523a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2430,8 +2430,24 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ /*
+ * If we are a parallel autovacuum worker, check whether the cost-based
+ * parameters have changed in the leader worker.
+ * If so, vacuum_cost_delay and vacuum_cost_limit will be set to the
+ * values the leader worker is operating with.
+ *
+ * Do this before checking VacuumCostActive, because its value might
+ * change once the leader's parameters have been consumed.
+ */
+ parallel_vacuum_fix_cost_based_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2445,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * If we are the parallel autovacuum leader and some cost-based
+ * parameters have changed, let the other parallel workers know.
+ */
+ parallel_vacuum_propagate_cost_based_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c2f0a37eef2..06ecffeec42 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -54,6 +54,22 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Only autovacuum leader can reload config file. We use this structure in
+ * parallel autovacuum for keeping worker's parameters in sync with leader's
+ * parameters.
+ */
+typedef struct PVSharedCostParams
+{
+ slock_t spinlock; /* protects all fields below */
+
+ /* Copies of corresponding parameters from autovacuum leader process */
+ double cost_delay;
+ int cost_limit;
+} PVSharedCostParams;
+
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -123,6 +139,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool am_parallel_autovacuum;
+
+ /*
+ * Struct for syncing parameters between the supporting parallel
+ * autovacuum workers and the leader worker.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -396,6 +424,17 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->am_parallel_autovacuum = AmAutoVacuumWorkerProcess();
+
+ if (shared->am_parallel_autovacuum)
+ {
+ shared->cost_params.cost_delay = vacuum_cost_delay;
+ shared->cost_params.cost_limit = vacuum_cost_limit;
+ SpinLockInit(&shared->cost_params.spinlock);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -538,6 +577,53 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
+/*
+ * Function to be called from parallel autovacuum worker in order to sync
+ * some cost-based delay parameter with the leader worker.
+ */
+bool
+parallel_vacuum_fix_cost_based_params(void)
+{
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return false;
+
+ Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ vacuum_cost_delay = pv_shared_cost_params->cost_delay;
+ vacuum_cost_limit = pv_shared_cost_params->cost_limit;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+
+ if (vacuum_cost_delay > 0 && !VacuumFailsafeActive)
+ VacuumCostActive = true;
+
+ return true;
+}
+
+/*
+ * Function to be called from parallel autovacuum leader in order to propagate
+ * some cost-based parameters to the supportive workers.
+ */
+void
+parallel_vacuum_propagate_cost_based_params(void)
+{
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ pv_shared_cost_params->cost_delay = vacuum_cost_delay;
+ pv_shared_cost_params->cost_limit = vacuum_cost_limit;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -763,12 +849,26 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
+ /*
+ * To be able to exercise whether leader parallel autovacuum worker can
+ * propagate cost-based params to parallel workers, wait here until
+ * configuration is changed...
+ */
+ INJECTION_POINT("av-leader-before-reload-conf", NULL);
+
/*
* Join as a parallel worker. The leader vacuums alone processes all
* parallel-safe indexes in the case where no workers are launched.
*/
parallel_vacuum_process_safe_indexes(pvs);
+ /*
+ * ...and then wait until the leader is guaranteed to have propagated the
+ * new parameter values to the workers. I.e., the tests expect that
+ * vacuum_delay_point was called while processing parallel-safe indexes.
+ */
+ INJECTION_POINT("av-leader-after-reload-conf", NULL);
+
/*
* Next, accumulate buffer and WAL usage. (This must wait for the workers
* to finish, or we might get incomplete data.)
@@ -1104,6 +1204,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->am_parallel_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1131,6 +1234,33 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Prepare to track buffer usage during parallel execution */
InstrStartParallelQuery();
+#ifdef USE_INJECTION_POINTS
+ if (shared->am_parallel_autovacuum)
+ {
+ Assert(VacuumActiveNWorkers != NULL);
+
+ /*
+ * To be able to exercise whether leader parallel autovacuum worker can
+ * propagate cost-based params to parallel workers, wait here until
+ * configuration is changed and the leader worker has updated shared state.
+ */
+ INJECTION_POINT("av-worker-before-reload-conf", NULL);
+
+ /* Simulate config reload during normal processing */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+ vacuum_delay_point(false);
+ pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+ /*
+ * Wait until the worker is guaranteed to have consumed the new parameter
+ * values from the leader, and save the new value in injection point state.
+ */
+ INJECTION_POINT("autovacuum-set-cost-based-parameter",
+ &vacuum_cost_delay);
+ INJECTION_POINT("av-worker-after-reload-conf", NULL);
+ }
+#endif
+
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index ec5d70aacdc..73125439bed 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -411,6 +411,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern bool parallel_vacuum_fix_cost_based_params(void);
+extern void parallel_vacuum_propagate_cost_based_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
index 4cf7344b2ac..32254c53a5d 100644
--- a/src/test/modules/test_autovacuum/Makefile
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -12,6 +12,8 @@ DATA = test_autovacuum--1.0.sql
TAP_TESTS = 1
+EXTRA_INSTALL = src/test/modules/injection_points
+
export enable_injection_points
ifdef USE_PGXS
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
index 8bf153d132c..eec0f41b6a6 100644
--- a/src/test/modules/test_autovacuum/t/001_basic.pl
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -28,6 +28,11 @@ $node->append_conf('postgresql.conf', qq{
});
$node->start;
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
my $indexes_num = 4;
my $initial_rows_num = 10_000;
my $autovacuum_parallel_workers = 2;
@@ -73,6 +78,9 @@ $node->safe_psql('postgres', qq{
CREATE EXTENSION test_autovacuum;
SELECT inj_set_free_workers_attach();
SELECT inj_leader_failure_attach();
+ SELECT inj_check_av_param_attach();
+
+ CREATE EXTENSION injection_points;
});
# Test 1 :
@@ -166,5 +174,80 @@ $node->safe_psql('postgres', qq{
SELECT inj_leader_failure_detach();
});
+# Test 4:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_4 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('av-leader-before-reload-conf', 'wait');
+ SELECT injection_points_attach('av-leader-after-reload-conf', 'wait');
+ SELECT injection_points_attach('av-worker-before-reload-conf', 'wait');
+ SELECT injection_points_attach('av-worker-after-reload-conf', 'wait');
+});
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum leader reaches the point before
+# vacuum_delay_point, then change a cost-based config parameter.
+
+$node->wait_for_event('autovacuum worker', 'av-leader-before-reload-conf');
+$node->psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 10;
+ SELECT pg_reload_conf();
+});
+$node->psql('postgres', qq{
+ SELECT injection_points_wakeup('av-leader-before-reload-conf');
+});
+
+# Wait until the leader propagates the new parameter's value to the other
+# workers, then let them call vacuum_delay_point
+
+$node->wait_for_event('autovacuum worker', 'av-leader-after-reload-conf');
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('av-leader-after-reload-conf');
+ SELECT injection_points_wakeup('av-worker-before-reload-conf');
+});
+
+# Check whether the parallel worker has consumed the new parameter's value
+# from the leader.
+# Actually, this can happen before the worker gets to the injection point, but
+# we want to make everything as deterministic as possible.
+
+$node->wait_for_event('parallel worker', 'av-worker-after-reload-conf');
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_worker_param_value('vacuum_cost_delay');",
+ stdout => \$psql_out,
+);
+is($psql_out, 10.0, 'Leader successfully propagated parameter value');
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('av-worker-after-reload-conf');
+});
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('av-leader-before-reload-conf');
+ SELECT injection_points_detach('av-leader-after-reload-conf');
+ SELECT injection_points_detach('av-worker-before-reload-conf');
+ SELECT injection_points_detach('av-worker-after-reload-conf');
+ SELECT inj_check_av_param_detach();
+
+ DROP EXTENSION test_autovacuum;
+ DROP EXTENSION injection_points;
+});
+
$node->stop;
done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
index 017d5da85ea..cb0407952d7 100644
--- a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -14,6 +14,10 @@ CREATE FUNCTION trigger_leader_failure(failure_type text)
RETURNS VOID STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+CREATE FUNCTION get_parallel_autovacuum_worker_param_value(param_name text)
+RETURNS FLOAT8 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
/*
* Injection point related functions
*/
@@ -32,3 +36,11 @@ AS 'MODULE_PATHNAME' LANGUAGE C;
CREATE FUNCTION inj_leader_failure_detach()
RETURNS VOID STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_check_av_param_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_check_av_param_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
index 7948f4858ae..e96cfda7ae9 100644
--- a/src/test/modules/test_autovacuum/test_autovacuum.c
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -38,6 +38,9 @@ typedef struct InjPointState
bool enabled_leader_failure;
AVLeaderFaulureType ftype;
+
+ bool enabled_check_av_param;
+ double vacuum_cost_delay;
} InjPointState;
static InjPointState * inj_point_state;
@@ -92,6 +95,12 @@ test_autovacuum_shmem_startup(void)
"inj_trigger_leader_failure",
NULL,
0);
+
+ InjectionPointAttach("autovacuum-set-cost-based-parameter",
+ "test_autovacuum",
+ "inj_set_av_parameter",
+ NULL,
+ 0);
}
LWLockRelease(AddinShmemInitLock);
@@ -109,6 +118,9 @@ _PG_init(void)
shmem_startup_hook = test_autovacuum_shmem_startup;
}
+extern PGDLLEXPORT void inj_set_av_parameter(const char *name,
+ const void *private_data,
+ void *arg);
extern PGDLLEXPORT void inj_set_free_workers(const char *name,
const void *private_data,
void *arg);
@@ -205,6 +217,45 @@ trigger_leader_failure(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/*
+ * Set current setting of "vacuum_cost_delay" parameter.
+ *
+ * Function is called from parallel autovacuum worker.
+ */
+void
+inj_set_av_parameter(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set autovacuum parameter injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_check_av_param)
+ {
+ Assert(arg != NULL);
+ inj_point_state->vacuum_cost_delay = *(double *) arg;
+ }
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_worker_param_value);
+Datum
+get_parallel_autovacuum_worker_param_value(PG_FUNCTION_ARGS)
+{
+ const char *param_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ double value = 0.0;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(param_name, "vacuum_cost_delay") == 0)
+ value = inj_point_state->vacuum_cost_delay;
+ else
+ elog(ERROR,
+ "cannot retrieve parameter %s from injection point", param_name);
+
+ PG_RETURN_FLOAT8((float8) value);
+}
+
PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
Datum
inj_set_free_workers_attach(PG_FUNCTION_ARGS)
@@ -253,3 +304,27 @@ inj_leader_failure_detach(PG_FUNCTION_ARGS)
#endif
PG_RETURN_VOID();
}
+
+PG_FUNCTION_INFO_V1(inj_check_av_param_attach);
+Datum
+inj_check_av_param_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_check_av_param = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_check_av_param_detach);
+Datum
+inj_check_av_param_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_check_av_param = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-07 13:51 zengman <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: zengman @ 2026-01-07 13:51 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: pgsql-hackers
Hi,
I noticed one thing: autovacuum_max_parallel_workers is initialized to 0 in globals.c,
but its GUC default (boot_val) is '2' in guc_parameters.dat. While GUC overrides it on startup,
this mismatch may cause confusion. Perhaps we should modify this to match the approach for max_parallel_workers.
--
Regards,
Man Zeng
www.openhalo.org
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-07 20:52 Daniil Davydov <[email protected]>
parent: zengman <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2026-01-07 20:52 UTC (permalink / raw)
To: zengman <[email protected]>; +Cc: pgsql-hackers
Hi,
On Wed, Jan 7, 2026 at 8:51 PM zengman <[email protected]> wrote:
>
> I noticed one thing: autovacuum_max_parallel_workers is initialized to 0 in globals.c,
> but its GUC default (boot_val) is '2' in guc_parameters.dat. While GUC overrides it on startup,
> this mismatch may cause confusion. Perhaps we should modify this to match the approach for max_parallel_workers.
>
Good catch, thank you!
I'll fix it in the next version of the patch.
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-15 02:13 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-01-15 02:13 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Jan 7, 2026 at 1:51 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Tue, Jan 6, 2026 at 3:44 AM Daniil Davydov <[email protected]> wrote:
> >
> > On Tue, Jan 6, 2026 at 1:51 AM Masahiko Sawada <[email protected]> wrote:
> > >
> > > Are you still working on it? Or shall I draft this part on top of the
> > > 0001 patch?
> >
> > I thought about some "beautiful" approach, but for now I have
> > only one idea - force parallel a/v workers to get values for these
> > parameters from shmem (which obviously can be modified by the
> > leader a/v process). I'll post this patch in the near future.
> >
>
> I am posting a draft version of the patch (see 0005) that allows parallel
> leader to propagate changes of cost-based parameters to its parallel
> workers. It is a very rough fix, but it reflects my idea that is to have some
> shared state from which parallel workers can get values for the parameters
> (and which only leader worker can modify, obviously).
>
> I have also added a test that shows that this idea is working - the test
> ensures that parallel workers can change its parameters if they have been
> changed for the leader worker.
>
> Note that so far the work is in progress - this logic works only for
> vacuum_cost_delay and vacuum_cost_limits parameters. I think that we
> should agree on an idea first, and only then apply logic for all appropriate
> parameters.
>
> What do you think?
Thank you for updating the patches! Here are review comments.
* 0001 patch
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+
+ Assert(av_nworkers_reserved == 0);
+}
While adding the assertion here makes sense, the assertion won't work
in non-assertion builds. I guess it's safer to call
AutoVacuumReleaseAllParallelWorkers() regardless of the code to ensure
that no autovacuum workers exit while holding parallel workers.
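To illustrate the point, here is a minimal standalone model. It is not the patch code: the model_* names are made up, and the two globals are simplified stand-ins for AutoVacuumShmem->av_freeParallelWorkers and av_nworkers_reserved without any locking. The callback releases unconditionally, so the invariant also holds in builds where assertions compile to nothing:

```c
#include <assert.h>

/* Simplified stand-ins for the shared free-worker counter and the
 * per-process reservation counter (hypothetical names, no locking). */
int model_free_workers = 0;
int model_nworkers_reserved = 0;

/* Give back everything this process still holds; a no-op if nothing is held. */
void
model_release_all_parallel_workers(void)
{
    model_free_workers += model_nworkers_reserved;
    model_nworkers_reserved = 0;
}

/* Exit callback: release regardless of the exit code. */
void
model_before_shmem_exit(int code)
{
    (void) code;                /* exit code is deliberately ignored */
    model_release_all_parallel_workers();
    assert(model_nworkers_reserved == 0);
}
```

Calling model_before_shmem_exit() a second time changes nothing, which is why the unconditional call is cheap on the normal exit path.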
---
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
I think it would be better to set this callback later, e.g. just before
the main loop that processes the tables, since nothing is gained by
setting it very early.
---
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
Suppose the previous autovacuum_max_parallel_workers is 5 and 2 workers
are reserved (i.e., there are 3 free parallel workers). If
autovacuum_max_parallel_workers changes to 2, the new
AutoVacuumShmem->av_freeParallelWorkers would be 2 based on the above
code, but I believe that the new number of free workers should be 0
as 2 workers are already running. What do you think? I guess
we can calculate the new number of free workers by:
Max((autovacuum_max_parallel_workers - prev_max_parallel_workers) +
AutoVacuumShmem->av_freeParallelWorkers, 0)
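The arithmetic of the example can be checked with a small standalone sketch. This is only a model of the two calculations: the function names are hypothetical, and plain ints stand in for the shared-memory counter.

```c
#include <assert.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))
#define Max(a, b) ((a) > (b) ? (a) : (b))

/* Model of the logic quoted from the patch: cap first, then grow on increase. */
int
adjust_free_patch(int free_workers, int prev_max, int new_max)
{
    free_workers = Min(free_workers, new_max);
    if (new_max > prev_max)
        free_workers += new_max - prev_max;
    return free_workers;
}

/* Model of the suggested formula: shift by the delta, clamp at zero. */
int
adjust_free_suggested(int free_workers, int prev_max, int new_max)
{
    return Max((new_max - prev_max) + free_workers, 0);
}
```

With 3 free workers and the limit lowered from 5 to 2, adjust_free_patch() yields 2 while adjust_free_suggested() yields 0. When the limit is raised from 5 to 7, both yield 5.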
---
I've attached a patch proposing some minor changes.
* 0002 patch
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
I think that LVRelState can have PVWorkersUsage instead of a pointer to it.
---
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by autovacuum leader worker, because it must log it in
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
I think we can report the worker statistics even in VACUUM VERBOSE
logs. Currently VACUUM VERBOSE reports the worker usage just during
index vacuuming but it would make sense to report the overall
statistics in vacuum logs. It would help make VACUUM VERBOSE logs and
autovacuum logs consistent.
But we don't need to report the worker usage if we didn't use the
parallel vacuum (i.e., if nplanned == 0).
---
+ /* Remember these values, if we asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
This code runs after the attempt to reserve parallel workers.
Consequently, if we fail to reserve any workers due to
autovacuum_max_parallel_workers, we report the status as if parallel
vacuum wasn't planned at all. I think knowing the number of workers
that were planned but not reserved would provide valuable insight for
users tuning autovacuum_max_parallel_workers.
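As a rough sketch of the ordering I have in mind (a standalone model with made-up names, which also assumes for simplicity that every reserved worker is launched), the planned count is recorded before the reservation can shrink it:

```c
#include <assert.h>

typedef struct ModelWorkersUsage
{
    int nplanned;               /* workers the leader wanted */
    int nlaunched;              /* workers that actually started */
} ModelWorkersUsage;

/* Hand out at most *free_workers workers; returns how many were granted. */
int
model_reserve(int *free_workers, int requested)
{
    int granted = requested < *free_workers ? requested : *free_workers;

    *free_workers -= granted;
    return granted;
}

/* Record the plan first, then reserve. */
int
model_plan_and_launch(ModelWorkersUsage *usage, int *free_workers, int planned)
{
    int launched;

    usage->nplanned += planned;     /* counted even if nothing is granted */
    launched = model_reserve(free_workers, planned);
    usage->nlaunched += launched;   /* model: reserved == launched */
    return launched;
}
```

With one free worker, planning 3 and then 2 workers leaves nplanned = 5 and nlaunched = 1, so the log would show how far the limit fell short of the plan.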
---
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
Since these numbers are the total number of workers planned and
launched, how about changing it to something like "parallel index
vacuum/cleanup: %d workers were planned and %d workers were launched
in total"?
* 0003 patch
+typedef enum AVLeaderFaulureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFaulureType;
I'm concerned that this somewhat overlaps with what injection points
already do, as we can set 'error' in injection_points_attach(). For
the FATAL error, we can use pg_terminate_backend() to terminate the
autovacuum worker while it keeps waiting due to
injection_points_attach() with action='wait'.
---
+ /*
+ * Injection point to help exercising number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
This injection point is added in two places. IIUC its purpose is to
update the free_parallel_workers of InjPointState, and that value is
read by the get_parallel_autovacuum_free_workers() SQL function during
the TAP test. I guess it's better to have the
get_parallel_autovacuum_free_workers() function check
av_freeParallelWorkers directly, with proper locking.
---
It would be great if we could test the av_freeParallelWorkers
adjustment when max_parallel_maintenance_workers changes.
* 0005 patch
+typedef struct PVSharedCostParams
+{
+ slock_t spinlock; /* protects all fields below */
+
+ /* Copies of corresponding parameters from autovacuum leader process */
+ double cost_delay;
+ int cost_limit;
+} PVSharedCostParams;
Since parallel workers don't reload the config file, I think other
vacuum delay related parameters such as VacuumCostPage{Miss|Hit|Dirty}
also need to be shared by the leader.
---
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ /*
+ * If we are autovacuum parallel worker, check whether cost-based
+ * parameters had changed in leader worker.
+ * If so, vacuum_cost_delay and vacuum_cost_limit will be set to the
+ * values which leader worker is operating on.
+ *
+ * Do it before checking VacuumCostActive, because its value might be
+ * changed after leader's parameters consumption.
+ */
+ parallel_vacuum_fix_cost_based_params();
+ }
We need to add checks to prevent the normal backend running the VACUUM
command from calling parallel_vacuum_fix_cost_based_params().
IIUC autovacuum parallel workers would call
parallel_vacuum_fix_cost_based_params() and update their
vacuum_cost_{delay|limit} every vacuum_delay_point().
---
+/*
+ * Function to be called from parallel autovacuum worker in order to sync
+ * some cost-based delay parameter with the leader worker.
+ */
+bool
+parallel_vacuum_fix_cost_based_params(void)
+{
The 'fix' doesn't sound right to me as it's not broken actually. How
about something like parallel_vacuum_update_shared_delay_params?
+ Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ vacuum_cost_delay = pv_shared_cost_params->cost_delay;
+ vacuum_cost_limit = pv_shared_cost_params->cost_limit;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
IIUC autovacuum parallel workers seem to update their
vacuum_cost_{delay|limit} at every vacuum_delay_point(), which doesn't
seem good. Can we somehow avoid unnecessary updates?
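One possible direction, purely as a sketch: the leader bumps a generation counter whenever it publishes new values, and a worker takes the lock only when the counter differs from the one it saw last. Everything below is hypothetical and standalone; it uses C11 atomics and an atomic_flag as stand-ins for the PostgreSQL atomics and spinlock, and none of the model_* names exist in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct ModelSharedCostParams
{
    atomic_flag lock;           /* stand-in for the spinlock */
    atomic_uint generation;     /* bumped by the leader on every change */
    double      cost_delay;
    int         cost_limit;
} ModelSharedCostParams;

typedef struct ModelLocalCostParams
{
    unsigned    seen_generation;    /* 0 means "never synced" */
    double      cost_delay;
    int         cost_limit;
} ModelLocalCostParams;

void
model_shared_init(ModelSharedCostParams *shared, double delay, int limit)
{
    atomic_flag_clear(&shared->lock);
    atomic_init(&shared->generation, 1);    /* forces the first sync */
    shared->cost_delay = delay;
    shared->cost_limit = limit;
}

/* Leader side: store the new values and bump the generation under the lock. */
void
model_leader_publish(ModelSharedCostParams *shared, double delay, int limit)
{
    while (atomic_flag_test_and_set(&shared->lock))
        ;                       /* spin */
    shared->cost_delay = delay;
    shared->cost_limit = limit;
    atomic_fetch_add(&shared->generation, 1);
    atomic_flag_clear(&shared->lock);
}

/* Worker side: returns true only if the local copy was actually refreshed. */
bool
model_worker_refresh(ModelSharedCostParams *shared, ModelLocalCostParams *local)
{
    /* Fast path: nothing was published since the last sync, skip the lock. */
    if (atomic_load(&shared->generation) == local->seen_generation)
        return false;

    while (atomic_flag_test_and_set(&shared->lock))
        ;                       /* spin */
    local->cost_delay = shared->cost_delay;
    local->cost_limit = shared->cost_limit;
    local->seen_generation = atomic_load(&shared->generation);
    atomic_flag_clear(&shared->lock);
    return true;
}
```

In this model the common case costs a single atomic read per delay point, and the lock is taken only after the leader has published something new.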
---
+
+ if (vacuum_cost_delay > 0 && !VacuumFailsafeActive)
+ VacuumCostActive = true;
+
Should we consider the case of disabling VacuumCostActive as well?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[application/octet-stream] 0001_masahiko.patch (4.4K, 2-0001_masahiko.patch)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 6a3a00585f9..cb42d4e572f 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -656,7 +656,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
nworkers = Min(nworkers, pvs->pcxt->nworkers);
/*
- * Reserve workers in autovacuum global state. Note, that we may be given
+ * Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
@@ -706,15 +706,12 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
- if (AmAutoVacuumWorkerProcess() &&
- pvs->pcxt->nworkers_launched < nworkers)
- {
- /*
- * Tell autovacuum that we could not launch all the previously
- * reserved workers.
- */
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched < nworkers)
AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
- }
if (pvs->pcxt->nworkers_launched > 0)
{
@@ -765,7 +762,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
- /* Also release all previously reserved parallel autovacuum workers */
+ /* Release all the reserved parallel workers for autovacuum */
if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index bc11970bfee..6ccc88c4e1e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -152,8 +152,9 @@ static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
/*
- * Variable to keep number of currently reserved parallel autovacuum workers.
- * It is only relevant for parallel autovacuum leader process.
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
*/
static int av_nworkers_reserved = 0;
@@ -3407,33 +3408,24 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
}
/*
- * In order to meet the 'autovacuum_max_parallel_workers' limit, leader
- * autovacuum process must call this function during computing the parallel
- * degree.
+ * Reserves parallel workers for autovacuum.
*
- * 'nworkers' is the desired number of parallel workers to reserve. Function
- * sets 'nworkers' to the number of parallel workers that actually can be
- * launched and reserves these workers (if any) in global autovacuum state.
- *
- * NOTE: We will try to provide as many workers as requested, even if caller
- * will occupy all available workers.
+ * nworkers is an in/out parameter; the requested number of parallel workers
+ * to reserve by the caller, and set to the actual number of reserved workers.
*/
void
AutoVacuumReserveParallelWorkers(int *nworkers)
{
- /* Only leader worker can call this function. */
+ /* Only leader autovacuum worker can call this function. */
Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
- /*
- * We can only reserve workers at the beginning of parallel index
- * processing, so we must not have any reserved workers right now.
- */
+ /* The worker must not have any reserved workers yet */
Assert(av_nworkers_reserved == 0);
LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
/* Provide as many workers as we can. */
- *nworkers= Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
/* Remember how many workers we have reserved. */
@@ -3632,8 +3624,8 @@ check_av_worker_gucs(void)
}
/*
- * Make sure that number of free parallel workers corresponds to the
- * autovacuum_max_parallel_workers parameter (after it was changed).
+ * Adjusts the number of free parallel workers to correspond to the new
+ * autovacuum_max_parallel_workers value.
*/
static void
adjust_free_parallel_workers(int prev_max_parallel_workers)
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-16 14:10 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-01-16 14:10 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Thu, Jan 15, 2026 at 9:13 AM Masahiko Sawada <[email protected]> wrote:
>
> Thank you for updating the patches! Here are review comments.
>
Thank you for the review!
>
> +static void
> +autovacuum_worker_before_shmem_exit(int code, Datum arg)
> +{
> + if (code != 0)
> + AutoVacuumReleaseAllParallelWorkers();
> +
> + Assert(av_nworkers_reserved == 0);
> +}
>
> While adding the assertion here makes sense, the assertion won't work
> in non-assertion builds. I guess it's safer to call
> AutoVacuumReleaseAllParallelWorkers() regardless of the code to ensure
> that no autovacuum workers exit while holding parallel workers.
>
OK, I agree.
> ---
> + before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
>
> I think it would be better to set this callback later, e.g. just before
> the main loop that processes the tables, since nothing is gained by
> setting it very early.
Yeah, agree. I'll also add a comment for it, because we already have a
"ReleaseAllParallelWorkers" function call in the try/catch block below.
>
> ---
> + /*
> + * Cap the number of free workers by new parameter's value, if needed.
> + */
> + AutoVacuumShmem->av_freeParallelWorkers =
> + Min(AutoVacuumShmem->av_freeParallelWorkers,
> + autovacuum_max_parallel_workers);
> +
> + if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
> + {
> + /*
> + * If user wants to increase number of parallel autovacuum workers, we
> + * must increase number of free workers.
> + */
> + AutoVacuumShmem->av_freeParallelWorkers +=
> + (autovacuum_max_parallel_workers - prev_max_parallel_workers);
> + }
>
> Suppose the previous autovacuum_max_parallel_workers is 5 and 2 workers
> are reserved (i.e., there are 3 free parallel workers). If
> autovacuum_max_parallel_workers changes to 2, the new
> AutoVacuumShmem->av_freeParallelWorkers would be 2 based on the above
> code, but I believe that the new number of free workers should be 0
> as 2 workers are already running. What do you think? I guess
> we can calculate the new number of free workers by:
>
> Max((autovacuum_max_parallel_workers - prev_max_parallel_workers) +
> AutoVacuumShmem->av_freeParallelWorkers, 0)
>
If av_max_parallel_workers was changed to 2, then we not only set
freeParallelWorkers to 2 but also set maxParallelWorkers to 2.
Thus, when the two previously reserved workers are released, the av leader
will encounter this code:
/*
* If the maximum number of parallel workers was reduced during execution,
* we must cap available workers number by its new value.
*/
AutoVacuumShmem->av_freeParallelWorkers =
Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
AutoVacuumShmem->av_maxParallelWorkers);
I.e. freeParallelWorkers will be left as "2".
The formula you suggested is also correct, but if you have no objections,
I would prefer not to change the existing logic. It seems more reliable to
me when the av leader explicitly handles such a situation.
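The sequence described above can be traced with a small standalone model. The struct and the model_* functions are hypothetical simplifications of the shared state and of the reserve/adjust/release paths, without locking:

```c
#include <assert.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))

typedef struct ModelAVShmem
{
    int free_workers;           /* models av_freeParallelWorkers */
    int max_workers;            /* models av_maxParallelWorkers */
} ModelAVShmem;

/* Reserve as many workers as possible; *nworkers is in/out. */
void
model_reserve(ModelAVShmem *shmem, int *nworkers)
{
    *nworkers = Min(shmem->free_workers, *nworkers);
    shmem->free_workers -= *nworkers;
}

/* React to a change of the limit: cap, grow on increase, remember the limit. */
void
model_set_max(ModelAVShmem *shmem, int new_max)
{
    int prev_max = shmem->max_workers;

    shmem->free_workers = Min(shmem->free_workers, new_max);
    if (new_max > prev_max)
        shmem->free_workers += new_max - prev_max;
    shmem->max_workers = new_max;
}

/* Release workers, capping the free count by the current limit. */
void
model_release(ModelAVShmem *shmem, int nworkers)
{
    shmem->free_workers = Min(shmem->free_workers + nworkers,
                              shmem->max_workers);
}
```

Starting from a limit of 5, reserving 2 workers, lowering the limit to 2 and then releasing leaves free_workers at 2. In this model free_workers is also 2 in the window between the limit change and the release, while 2 workers are still reserved.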
> ---
> I've attached a patch proposing some minor changes.
>
Thanks! I agree with all fixes except a single one:
- * NOTE: We will try to provide as many workers as requested, even if caller
- * will occupy all available workers.
I think that this is a pretty important point. I'll leave this NOTE in the
v19 patch set. Do you mind?
>
> + /*
> + * Number of planned and actually launched parallel workers for all index
> + * scans, or NULL
> + */
> + PVWorkersUsage *workers_usage;
>
> I think that LVRelState can have PVWorkersUsage instead of a pointer to it.
>
Previously I used the NULL value of this pointer as a flag that we don't need
to log workers usage. Now I'll add a boolean flag for this purpose (IIUC, the
"nplanned > 0" condition is not enough to determine whether we should log
workers usage, because VACUUM PARALLEL can be called without VERBOSE).
> ---
> + /*
> + * Allocate space for workers usage statistics. Thus, we explicitly
> + * make clear that such statistics must be accumulated. For now, this
> + * is used only by autovacuum leader worker, because it must log it in
> + * the end of table processing.
> + */
> + vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
> + (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
> + NULL;
>
> I think we can report the worker statistics even in VACUUM VERBOSE
> logs. Currently VACUUM VERBOSE reports the worker usage just during
> index vacuuming but it would make sense to report the overall
> statistics in vacuum logs. It would help make VACUUM VERBOSE logs and
> autovacuum logs consistent.
>
Agree.
> But we don't need to report the worker usage if we didn't use the
> parallel vacuum (i.e., if nplanned == 0).
>
As I wrote above - we don't need to log workers usage if the VERBOSE option
is not specified (even if nplanned > 0). Am I missing something?
> ---
> + /* Remember these values, if we asked to. */
> + if (wusage != NULL)
> + {
> + wusage->nlaunched += pvs->pcxt->nworkers_launched;
> + wusage->nplanned += nworkers;
> + }
>
> This code runs after the attempt to reserve parallel workers.
> Consequently, if we fail to reserve any workers due to
> autovacuum_max_parallel_workers, we report the status as if parallel
> vacuum wasn't planned at all. I think knowing the number of workers
> that were planned but not reserved would provide valuable insight for
> users tuning autovacuum_max_parallel_workers.
>
100% agree.
> ---
> + if (vacrel->workers_usage)
> + appendStringInfo(&buf,
> + _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
> + vacrel->workers_usage->nplanned,
> + vacrel->workers_usage->nlaunched);
>
> Since these numbers are the total number of workers planned and
> launched, how about changing it to something like "parallel index
> vacuum/cleanup: %d workers were planned and %d workers were launched
> in total"?
>
Agree.
>
> +typedef enum AVLeaderFaulureType
> +{
> + FAIL_NONE,
> + FAIL_ERROR,
> + FAIL_FATAL,
> +} AVLeaderFaulureType;
>
> I'm concerned that this somewhat overlaps with what injection points
> already do, as we can set 'error' in injection_points_attach(). For
> the FATAL error, we can use pg_terminate_backend() to terminate the
> autovacuum worker while it keeps waiting due to
> injection_points_attach() with action='wait'.
>
Oh, I didn't know about the possibility of testing FATAL errors with
pg_terminate_backend. After reading your letter I found this pattern
in signal_autovacuum.pl. This is beautiful.
Thank you, I'll rework these tests.
> ---
> + /*
> + * Injection point to help exercising number of available parallel
> + * autovacuum workers.
> + */
> + INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
> + &AutoVacuumShmem->av_freeParallelWorkers);
>
> This injection point is added in two places. IIUC its purpose is to
> update the free_parallel_workers of InjPointState, and that value is
> read by the get_parallel_autovacuum_free_workers() SQL function during
> the TAP test. I guess it's better to have the
> get_parallel_autovacuum_free_workers() function check
> av_freeParallelWorkers directly, with proper locking.
>
Agree.
> ---
> It would be great if we could test the av_freeParallelWorkers
> adjustment when max_parallel_maintenance_workers changes.
>
You mean "when autovacuum_max_parallel_workers changes"?
I'll add a test for it.
>
> * 0005 patch
>
> +typedef struct PVSharedCostParams
> +{
> + slock_t spinlock; /* protects all fields below */
> +
> + /* Copies of corresponding parameters from autovacuum leader process */
> + double cost_delay;
> + int cost_limit;
> +} PVSharedCostParams;
>
> Since parallel workers don't reload the config file, I think other
> vacuum delay related parameters such as VacuumCostPage{Miss|Hit|Dirty}
> also need to be shared by the leader.
>
Yes, I remember it. I didn't add them in the previous patch because it was
experimental. I'll add all appropriate parameters in v19.
> ---
> + if (!AmAutoVacuumWorkerProcess())
> + {
> + /*
> + * If we are autovacuum parallel worker, check whether cost-based
> + * parameters had changed in leader worker.
> + * If so, vacuum_cost_delay and vacuum_cost_limit will be set to the
> + * values which leader worker is operating on.
> + *
> + * Do it before checking VacuumCostActive, because its value might be
> + * changed after leader's parameters consumption.
> + */
> + parallel_vacuum_fix_cost_based_params();
> + }
>
> We need to add checks to prevent the normal backend running the VACUUM
> command from calling parallel_vacuum_fix_cost_based_params().
>
We already have such a check inside the "fix_cost_based" function:
/* Check whether we are running parallel autovacuum */
if (pv_shared_cost_params == NULL)
return false;
We also have this comment:
* If we are autovacuum parallel worker, check whether cost-based
* parameters had changed in leader worker.
As an alternative, I'll add a comment explicitly saying that the process will
return immediately if it is not a parallel autovacuum participant.
> IIUC autovacuum parallel workers would call
> parallel_vacuum_fix_cost_based_params() and update their
> vacuum_cost_{delay|limit} every vacuum_delay_point().
>
Yep.
> ---
> +/*
> + * Function to be called from parallel autovacuum worker in order to sync
> + * some cost-based delay parameter with the leader worker.
> + */
> +bool
> +parallel_vacuum_fix_cost_based_params(void)
> +{
>
> The 'fix' doesn't sound right to me as it's not broken actually. How
> about something like parallel_vacuum_update_shared_delay_params?
>
Agree.
> + Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
> +
> + SpinLockAcquire(&pv_shared_cost_params->spinlock);
> +
> + vacuum_cost_delay = pv_shared_cost_params->cost_delay;
> + vacuum_cost_limit = pv_shared_cost_params->cost_limit;
> +
> + SpinLockRelease(&pv_shared_cost_params->spinlock);
>
> IIUC autovacuum parallel workers seem to update their
> vacuum_cost_{delay|limit} at every vacuum_delay_point(), which doesn't
> seem good. Can we somehow avoid unnecessary updates?
More precisely, a parallel worker *reads* the leader's parameters at every
delay point. Obviously, this does not mean that the parameters will
necessarily be updated. But I don't see anything wrong with this logic:
each time, we simply get the most recent parameters from the leader. Of
course we could introduce some signaling mechanism, but it would have the
same effect as the current code.
> ---
> +
> + if (vacuum_cost_delay > 0 && !VacuumFailsafeActive)
> + VacuumCostActive = true;
> +
>
> Should we consider the case of disabling VacuumCostActive as well?
>
I think that we should. I'll add a VacuumUpdateCosts function call instead
of writing this logic manually. IIUC, it will not break anything.
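For clarity, the behavior I want in both directions looks roughly like the standalone model below. It is a simplification with made-up names, not the actual VacuumUpdateCosts() code, and resetting the balance on disable is an assumption of the model:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ModelCostState
{
    double  cost_delay;         /* models vacuum_cost_delay */
    bool    failsafe_active;    /* models VacuumFailsafeActive */
    bool    cost_active;        /* models VacuumCostActive */
    int     cost_balance;       /* models VacuumCostBalance */
} ModelCostState;

/* Recompute the "active" flag in both directions, not only when enabling. */
void
model_update_cost_active(ModelCostState *state)
{
    if (state->cost_delay > 0 && !state->failsafe_active)
        state->cost_active = true;
    else
    {
        state->cost_active = false;
        state->cost_balance = 0;    /* assumption: drop the accumulated cost */
    }
}
```

So if the leader's delay drops to 0, a worker that was throttling stops doing so after its next sync instead of staying active forever.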
Again, thank you very much for the review!
Please see the v19 patches, which address all the above comments
and zengman's notice. Main changes:
1) Fixes for before_shmem_exit callback
2) Some comments reword + pgindent on all files
3) Workers usage can also be reported for VACUUM PARALLEL
4) Deeply reworked tests
5) Propagation (from leader to worker) of all cost-based delay parameters
I have also changed the structure of the patch set: tests and documentation
are now the last patches to be applied.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v19-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 2-v19-0005-Documentation-for-parallel-autovacuum.patch)
From 96180a1b4a78c6202c95131afee2b75be8fcc534 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v19 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0fad34da6eb..c64897f4707 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2849,6 +2849,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9284,6 +9285,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the enabled
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 77c5a763d45..3592c9acff9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based
+ on the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v19-0003-Cost-based-parameters-propagation-for-parallel-a.patch (7.4K, 3-v19-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From ba30e217073cee821cd842f12ed91e4c10adb255 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v19 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 29 +++++++-
src/backend/commands/vacuumparallel.c | 99 +++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
4 files changed, 129 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index aa4fbec143f..4622107734f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2430,8 +2430,27 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ /*
+ * If we are a parallel *autovacuum* worker, check whether parameters
+ * related to cost-based delay have changed in the leader worker. If
+ * so, the corresponding parameters will be updated to the values the
+ * leader worker is operating on.
+ *
+ * Do it before checking VacuumCostActive, because its value might be
+ * changed after leader's parameters consumption.
+ *
+ * Note, that this function has no effect if we are non-autovacuum
+ * parallel worker.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2445,6 +2464,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * If we are parallel autovacuum leader and some of cost-based
+ * parameters had changed, let other parallel workers know.
+ */
+ parallel_vacuum_propagate_cost_based_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c32314f9731..71449630b63 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -53,6 +53,25 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Only autovacuum leader can reload config file. We use this structure in
+ * parallel autovacuum for keeping worker's parameters in sync with leader's
+ * parameters.
+ */
+typedef struct PVSharedCostParams
+{
+ slock_t spinlock; /* protects all fields below */
+
+ /* Copies of corresponding parameters from autovacuum leader process */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
+static PVSharedCostParams * pv_shared_cost_params = NULL;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +141,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool am_parallel_autovacuum;
+
+ /*
+ * Struct for keeping the cost-based parameters of the supporting
+ * parallel autovacuum workers in sync with the leader worker.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -395,6 +426,19 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->am_parallel_autovacuum = AmAutoVacuumWorkerProcess();
+
+ if (shared->am_parallel_autovacuum)
+ {
+ shared->cost_params.cost_delay = vacuum_cost_delay;
+ shared->cost_params.cost_limit = vacuum_cost_limit;
+ shared->cost_params.cost_page_dirty = VacuumCostPageDirty;
+ shared->cost_params.cost_page_hit = VacuumCostPageHit;
+ shared->cost_params.cost_page_miss = VacuumCostPageMiss;
+ SpinLockInit(&shared->cost_params.spinlock);
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -537,6 +581,58 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
+/*
+ * Function to be called from a parallel autovacuum worker in order to sync
+ * the cost-based delay parameters with the leader worker.
+ */
+bool
+parallel_vacuum_update_shared_delay_params(void)
+{
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return false;
+
+ Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+
+ VacuumUpdateCosts();
+
+ return true;
+}
+
+/*
+ * Function to be called from the parallel autovacuum leader in order to
+ * propagate the cost-based parameters to the supporting workers.
+ */
+void
+parallel_vacuum_propagate_cost_based_params(void)
+{
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ pv_shared_cost_params->cost_delay = vacuum_cost_delay;
+ pv_shared_cost_params->cost_limit = vacuum_cost_limit;
+ pv_shared_cost_params->cost_page_dirty = VacuumCostPageDirty;
+ pv_shared_cost_params->cost_page_hit = VacuumCostPageHit;
+ pv_shared_cost_params->cost_page_miss = VacuumCostPageMiss;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1094,6 +1190,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->am_parallel_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 097b1dd55cf..98965fd8e2d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1693,7 +1693,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index ec5d70aacdc..09696a8eafe 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -411,6 +411,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern bool parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_cost_based_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
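
For readers following the cost-parameter patch above, here is a minimal standalone sketch of the publish/consume pattern it describes: the leader copies its current settings into a shared struct under a lock, and each worker copies them back out before checking its delay state. This is not PostgreSQL code. All Demo*/demo_* names are hypothetical, only two of the five parameters are shown, and a C11 atomic_flag stands in for slock_t.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for PVSharedCostParams (not the PostgreSQL struct). */
typedef struct DemoSharedCostParams
{
	atomic_flag lock;			/* plays the role of the spinlock */
	double		cost_delay;
	int			cost_limit;
} DemoSharedCostParams;

/* Hypothetical per-process copy of the same settings. */
typedef struct DemoLocalCostParams
{
	double		cost_delay;
	int			cost_limit;
} DemoLocalCostParams;

static void
demo_lock(DemoSharedCostParams *shared)
{
	/* Spin until the flag was previously clear, i.e. we now own it. */
	while (atomic_flag_test_and_set_explicit(&shared->lock,
											 memory_order_acquire))
		;
}

static void
demo_unlock(DemoSharedCostParams *shared)
{
	atomic_flag_clear_explicit(&shared->lock, memory_order_release);
}

/* Leader side: publish the leader's current values to shared memory. */
static void
demo_leader_propagate(DemoSharedCostParams *shared,
					  const DemoLocalCostParams *leader)
{
	demo_lock(shared);
	shared->cost_delay = leader->cost_delay;
	shared->cost_limit = leader->cost_limit;
	demo_unlock(shared);
}

/* One-time setup: clear the lock, then publish the initial values. */
static void
demo_shared_init(DemoSharedCostParams *shared,
				 const DemoLocalCostParams *leader)
{
	atomic_flag_clear(&shared->lock);
	demo_leader_propagate(shared, leader);
}

/* Worker side: overwrite the local copy with whatever the leader published. */
static void
demo_worker_update(DemoSharedCostParams *shared, DemoLocalCostParams *worker)
{
	demo_lock(shared);
	worker->cost_delay = shared->cost_delay;
	worker->cost_limit = shared->cost_limit;
	demo_unlock(shared);
}
```

Note that in this sketch a worker only sees new values the next time it calls demo_worker_update(); that mirrors the patch, where the worker pulls the values at each vacuum_delay_point() call rather than being signalled.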
[text/x-patch] v19-0001-Parallel-autovacuum.patch (19.4K, 4-v19-0001-Parallel-autovacuum.patch)
From c961c20ba124ad925711df79a82ef4026b920724 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v19 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 39 ++++-
src/backend/postmaster/autovacuum.c | 159 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 232 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0b83f98ed5f..692ac46733e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..cb42d4e572f 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched < nworkers)
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +761,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 22379de1e31..097b1dd55cf 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +292,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -361,6 +372,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -760,6 +773,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -776,6 +791,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1380,6 +1404,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if the parallel
+ * autovacuum leader is exiting due to a FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2277,6 +2311,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2458,6 +2498,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2858,8 +2904,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3336,6 +3386,76 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter: on input, the number of parallel workers
+ * the caller wants to reserve; on output, the number actually reserved.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Leader autovacuum process must call this function in order to update global
+ * autovacuum state, so other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3396,6 +3516,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3477,3 +3601,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjusts the number of free parallel workers to correspond to the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae9d5f3fb70..c8a99a67767 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 7c60b125564..e933f5048f7 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # max parallel autovacuum workers;
+ # limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b91bc00062..ed59a21289c 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index db559b39c4d..ad6e19f426c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..3f5b59a15bd 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d03ab247788..c1d882659f9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will be computed based on number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
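
The reserve/release bookkeeping in the patch above may be easier to follow in isolation. The sketch below reproduces only the arithmetic of AutoVacuumReserveParallelWorkers(), AutoVacuumReleaseParallelWorkers() and adjust_free_parallel_workers() as they appear in the diff. The DemoWorkerPool type and demo_* functions are hypothetical, and the AutovacuumLock locking and the per-process av_nworkers_reserved counter are deliberately left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the two counters added to AutoVacuumShmemStruct. */
typedef struct DemoWorkerPool
{
	int			free_workers;	/* workers not reserved by any leader */
	int			max_workers;	/* current cap (the GUC value) */
} DemoWorkerPool;

static int
demo_min(int a, int b)
{
	return (a < b) ? a : b;
}

static void
demo_pool_init(DemoWorkerPool *pool, int max_workers)
{
	pool->max_workers = max_workers;
	pool->free_workers = max_workers;
}

/* In/out: *nworkers is the request on entry and the grant on return. */
static void
demo_reserve(DemoWorkerPool *pool, int *nworkers)
{
	*nworkers = demo_min(pool->free_workers, *nworkers);
	pool->free_workers -= *nworkers;
}

/* Give workers back, never exceeding the (possibly reduced) cap. */
static void
demo_release(DemoWorkerPool *pool, int nworkers)
{
	pool->free_workers = demo_min(pool->free_workers + nworkers,
								  pool->max_workers);
}

/* Apply a new cap, following the same steps as the patch. */
static void
demo_adjust_max(DemoWorkerPool *pool, int new_max)
{
	int			prev_max = pool->max_workers;

	pool->free_workers = demo_min(pool->free_workers, new_max);
	if (new_max > prev_max)
		pool->free_workers += new_max - prev_max;
	pool->max_workers = new_max;
}
```

The cap in demo_release() is what keeps the free count consistent when the limit is lowered while some workers are still reserved: the late releases are simply clamped to the new maximum.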
[text/x-patch] v19-0002-Logging-for-parallel-autovacuum.patch (8.4K, 5-v19-0002-Logging-for-parallel-autovacuum.patch)
From 895131049b2416566781ff585a577bc0342f5f71 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v19 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 21 +++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 55 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1fcb212ab3d..0be33cb84a6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -630,6 +636,7 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
LVRelState *vacrel;
bool verbose,
instrument,
+ log_workers_usage = false, /* for parallel [auto]vacuum only */
skipwithvm,
frozenxid_updated,
minmulti_updated;
@@ -709,6 +716,12 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc_array(char *, vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Worker usage statistics must be accumulated for parallel autovacuum
+ * and for VACUUM (PARALLEL, VERBOSE).
+ */
+ log_workers_usage = (params.nworkers > -1);
}
/*
@@ -781,6 +794,9 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->vm_new_visible_frozen_pages = 0;
vacrel->vm_new_frozen_pages = 0;
+ vacrel->workers_usage.nlaunched = 0;
+ vacrel->workers_usage.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1139,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (log_workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.nplanned,
+ vacrel->workers_usage.nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2698,7 +2719,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3131,7 +3153,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index cb42d4e572f..c32314f9731 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -655,6 +656,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wusage != NULL && nworkers > 0)
+ wusage->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
@@ -725,6 +730,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wusage != NULL)
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..ec5d70aacdc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3f3a888fd0e..afebde72235 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2404,6 +2404,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
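
As a rough illustration of the logging patch above, the following sketch shows how the planned and launched counts accumulate over several index-processing passes and how the resulting log line is built. DemoWorkersUsage and the demo_* functions are hypothetical stand-ins for PVWorkersUsage and the code in parallel_vacuum_process_all_indexes(); snprintf replaces appendStringInfo, and the message is left untranslated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for PVWorkersUsage. */
typedef struct DemoWorkersUsage
{
	int			nlaunched;
	int			nplanned;
} DemoWorkersUsage;

/*
 * Record one index vacuum or cleanup pass. A NULL usage pointer means the
 * caller did not ask for statistics, so nothing is recorded.
 */
static void
demo_record_pass(DemoWorkersUsage *usage, int planned, int launched)
{
	if (usage == NULL)
		return;
	if (planned > 0)
		usage->nplanned += planned;
	if (launched > 0)
		usage->nlaunched += launched;
}

/* Build the summary line; returns what snprintf returns. */
static int
demo_format_usage(const DemoWorkersUsage *usage, char *buf, size_t buflen)
{
	return snprintf(buf, buflen,
					"parallel index vacuum/cleanup: %d workers were planned "
					"and %d workers were launched in total\n",
					usage->nplanned, usage->nlaunched);
}
```

Because both counters are totals over all passes, a gap between them only says that some pass launched fewer workers than planned, not which one.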
[text/x-patch] v19-0004-Tests-for-parallel-autovacuum.patch (23.8K, 6-v19-0004-Tests-for-parallel-autovacuum.patch)
From cf06a408afe33373a802fb16507c91850244e638 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v19 4/5] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 29 ++
src/backend/postmaster/autovacuum.c | 17 +
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 ++
.../modules/test_autovacuum/t/001_basic.pl | 325 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 20 ++
.../modules/test_autovacuum/test_autovacuum.c | 166 +++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
12 files changed, 629 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 71449630b63..1ead6e1193b 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -846,6 +847,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -855,6 +864,15 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
parallel_vacuum_process_safe_indexes(pvs);
+ /*
+ * To be able to test whether the leader parallel autovacuum worker can
+ * propagate cost-based params to parallel workers, wait here until the
+ * configuration is changed. I.e., tests expect that vacuum_delay_point
+ * has been called during index processing (if the config was changed).
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-after-indexes-processing", NULL);
+
/*
* Next, accumulate buffer and WAL usage. (This must wait for the workers
* to finish, or we might get incomplete data.)
@@ -1220,9 +1238,20 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Prepare to track buffer usage during parallel execution */
InstrStartParallelQuery();
+ INJECTION_POINT("parallel-worker-before-indexes-processing", NULL);
+
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
+ /*
+ * There is no guarantee that each parallel worker will necessarily
+ * process at least one index. Thus, at this point we cannot be sure that
+ * the worker has called vacuum_delay_point. In order to test cost-based
+ * parameter propagation (from the leader worker), call vacuum_delay_point
+ * here if the injection point is active.
+ */
+ INJECTION_POINT("parallel-autovacuum-force-delay-point", NULL);
+
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 98965fd8e2d..db99241df3e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3456,6 +3456,23 @@ AutoVacuumReleaseAllParallelWorkers(void)
AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ *
+ * For testing purpose only!
+ */
+uint32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ uint32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 3f5b59a15bd..f50c7462cd4 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
/* parallel autovacuum stuff */
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern uint32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4c6d56d97d8..bfe365fa575 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1b31c5b98d6..01a3e3ec044 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..369d2905a2b
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,325 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ ANALYZE test_autovac;
+ });
+}
+
+sub wait_for_av_log
+{
+ my ($node, $expected_log) = @_;
+
+ $node->wait_for_log($expected_log);
+ truncate $node->logfile, 0 or die "truncate failed: $!";
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres, so it can launch parallel autovacuum workers, log all
+# information we are interested in and autovacuum works frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_av_log($node,
+ qr/parallel index vacuum\/cleanup: 2 workers were planned / .
+ qr/and 2 workers were launched in total/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 20, 'All parallel workers have been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-after-indexes-processing', 'wait');
+ SELECT injection_points_attach('parallel-worker-before-indexes-processing', 'wait');
+ SELECT inj_force_delay_point_attach();
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum leader launches parallel worker and falls
+# asleep on the injection point
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Reload config - leader worker must update its own parameters during indexes
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until leader worker is guaranteed to update parameters and propagate
+# their values to the parallel worker
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-after-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-after-indexes-processing');
+});
+
+# Now wake up the parallel worker and force it to call vacuum_delay_point
+$node->wait_for_event(
+ 'parallel worker',
+ 'parallel-worker-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('parallel-worker-before-indexes-processing');
+});
+
+# Check whether worker successfully updated all parameters
+wait_for_av_log($node,
+ qr/Vacuum cost-based delay parameters of parallel worker:\n/ .
+ qr/\tvacuum_cost_limit = 500\n/ .
+ qr/\tvacuum_cost_delay = 2\n/ .
+ qr/\tvacuum_cost_page_miss = 10\n/ .
+ qr/\tvacuum_cost_page_dirty = 10\n/ .
+ qr/\tvacuum_cost_page_hit = 10\n/);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ SELECT injection_points_detach('autovacuum-leader-after-indexes-processing');
+ SELECT injection_points_detach('parallel-worker-before-indexes-processing');
+ SELECT inj_force_delay_point_detach();
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+wait_for_av_log($node,
+ qr/parallel index vacuum\/cleanup: 2 workers were planned / .
+ qr/and 2 workers were launched in total/);
+
+# When all parallel workers were released, the number of free parallel workers
+# must not exceed autovacuum_max_parallel_workers limit
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if leader gets an
+# error. First, simulate the situation where the leader exits due to an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+wait_for_av_log($node,
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as the test above, but simulate the situation where the leader exits due to FATAL.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+# Create role with pg_signal_autovacuum_worker for terminating autovacuum worker.
+$node->safe_psql('postgres', qq{
+ CREATE ROLE regress_worker_role;
+ GRANT pg_signal_autovacuum_worker TO regress_worker_role;
+ SET ROLE regress_worker_role;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+wait_for_av_log($node,
+ qr/terminating autovacuum process due to administrator command/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..679375fc82f
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,20 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_force_delay_point_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_force_delay_point_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..45050924f17
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,166 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "commands/vacuum.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef struct InjPointState
+{
+ bool enabled_force_delay_point;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_force_delay_point = false;
+
+ InjectionPointAttach("parallel-autovacuum-force-delay-point",
+ "test_autovacuum",
+ "inj_force_delay_point",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_force_delay_point(const char *name,
+ const void *private_data,
+ void *arg);
+
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_UINT32(nfree_workers);
+}
+
+/*
+ */
+void
+inj_force_delay_point(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("force delay point injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_force_delay_point)
+ {
+ StringInfoData buf;
+
+ Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
+
+ /* Simulate config reload during normal processing */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+ vacuum_delay_point(false);
+ pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+ initStringInfo(&buf);
+
+ appendStringInfo(&buf, "Vacuum cost-based delay parameters of parallel worker:\n");
+ appendStringInfo(&buf, "vacuum_cost_limit = %d\n", vacuum_cost_limit);
+ appendStringInfo(&buf, "vacuum_cost_delay = %g\n", vacuum_cost_delay);
+ appendStringInfo(&buf, "vacuum_cost_page_miss = %d\n", VacuumCostPageMiss);
+ appendStringInfo(&buf, "vacuum_cost_page_dirty = %d\n", VacuumCostPageDirty);
+ appendStringInfo(&buf, "vacuum_cost_page_hit = %d\n", VacuumCostPageHit);
+
+ ereport(LOG, errmsg("%s", buf.data));
+ pfree(buf.data);
+ }
+}
+
+PG_FUNCTION_INFO_V1(inj_force_delay_point_attach);
+Datum
+inj_force_delay_point_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_force_delay_point = true;
+#else
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_force_delay_point_detach);
+Datum
+inj_force_delay_point_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_force_delay_point = false;
+#else
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-16 22:20 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-01-16 22:20 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Fri, Jan 16, 2026 at 6:11 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Thu, Jan 15, 2026 at 9:13 AM Masahiko Sawada <[email protected]> wrote:
> >
> >
> > ---
> > + /*
> > + * Cap the number of free workers by new parameter's value, if needed.
> > + */
> > + AutoVacuumShmem->av_freeParallelWorkers =
> > + Min(AutoVacuumShmem->av_freeParallelWorkers,
> > + autovacuum_max_parallel_workers);
> > +
> > + if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
> > + {
> > + /*
> > + * If user wants to increase number of parallel autovacuum workers, we
> > + * must increase number of free workers.
> > + */
> > + AutoVacuumShmem->av_freeParallelWorkers +=
> > + (autovacuum_max_parallel_workers - prev_max_parallel_workers);
> > + }
> >
> > Suppose the previous autovacuum_max_parallel_workers is 5 and 2
> > workers are reserved (i.e., there are 3 free parallel workers). If
> > autovacuum_max_parallel_workers changes to 2, the new
> > AutoVacuumShmem->av_freeParallelWorkers would be 2 based on the above
> > code, but I believe that the new number of free workers should be 0
> > as 2 workers are already running. What do you think? I guess
> > we can calculate the new number of free workers by:
> >
> > Max((autovacuum_max_parallel_workers - prev_max_parallel_workers) +
> > AutoVacuumShmem->av_freeParallelWorkers, 0)
> >
>
> If av_max_parallel_workers was changed to 2, then we not only set
> freeParallelWorkers to 2 but also set maxParallelWorkers to 2.
> Thus, when previously reserved two workers are released, av leader will
> encounter this code:
>
> /*
> * If the maximum number of parallel workers was reduced during execution,
> * we must cap available workers number by its new value.
> */
> AutoVacuumShmem->av_freeParallelWorkers =
> Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
> AutoVacuumShmem->av_maxParallelWorkers);
>
> I.e. freeParallelWorkers will be left as "2".
>
> The formula you suggested is also correct, but if you have no objections,
> I would prefer not to change the existing logic. It seems more reliable to
> me when the av leader explicitly handles such a situation.
Looking at AutoVacuumReserveParallelWorkers(), it seems that we don't
check av_maxParallelWorkers there. Is it possible that two more
workers would be reserved even while the existing 2 workers are
running?
/* Provide as many workers as we can. */
*nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
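To make the scenario concrete, here is a minimal standalone sketch, assuming a simplified model of the shared counters. The ModelShmem struct and the model_* functions are hypothetical stand-ins and are not code from the patch. It replays "max 5, reserve 2, lower the limit to 2, reserve 2 more" under both the v19 adjustment and the delta formula discussed above.

```c
#include <assert.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))
#define Max(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for the two relevant AutoVacuumShmem fields. */
typedef struct ModelShmem
{
    int av_freeParallelWorkers;
    int av_maxParallelWorkers;
} ModelShmem;

/* Reservation as quoted above: hand out whatever is free, no check of the max. */
static int
model_reserve(ModelShmem *shm, int requested)
{
    int granted;

    assert(requested >= 0);
    granted = Min(shm->av_freeParallelWorkers, requested);
    shm->av_freeParallelWorkers -= granted;
    return granted;
}

/* GUC change modeled on the v19 hunk: cap first, grow only if the limit was raised. */
static void
model_guc_change_v19(ModelShmem *shm, int new_max)
{
    int prev_max = shm->av_maxParallelWorkers;

    shm->av_freeParallelWorkers = Min(shm->av_freeParallelWorkers, new_max);
    if (new_max > prev_max)
        shm->av_freeParallelWorkers += new_max - prev_max;
    shm->av_maxParallelWorkers = new_max;
}

/* GUC change using the delta formula suggested in the review. */
static void
model_guc_change_delta(ModelShmem *shm, int new_max)
{
    int prev_max = shm->av_maxParallelWorkers;

    shm->av_freeParallelWorkers =
        Max((new_max - prev_max) + shm->av_freeParallelWorkers, 0);
    shm->av_maxParallelWorkers = new_max;
}

/* Total workers reserved at once after: max 5, reserve 2, lower to 2, reserve 2. */
static int
model_total_reserved(int use_delta_formula)
{
    ModelShmem shm = {5, 5};
    int total = model_reserve(&shm, 2);

    if (use_delta_formula)
        model_guc_change_delta(&shm, 2);
    else
        model_guc_change_v19(&shm, 2);

    total += model_reserve(&shm, 2);
    return total;
}
```

In this model, model_total_reserved(0) ends with 4 workers reserved at the same time although the limit is 2, while model_total_reserved(1) stays at 2. Whether the real code can reach this state depends on details outside the quoted hunks.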
Some review comments on v19-0001 patch:
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
Since we want to release all reserved workers here, I think it would be
clearer to use AutoVacuumReleaseAllParallelWorkers() and add
Assert(av_nworkers_reserved == 0) at the end of
AutoVacuumReleaseAllParallelWorkers(). This way, we can ensure that
all workers are released, and it makes the code more readable. What do
you think?
I've attached a patch proposing this change (please find
v19-0001_masahiko.patch).
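As an illustration of the invariant only, here is a small standalone sketch. It assumes a simplified model in which a backend-local tally mirrors av_nworkers_reserved; the ModelAvState struct and model_* functions are hypothetical, and the mismatch between reserved and launched counts is made up for the example rather than taken from the patch.

```c
#include <assert.h>

/* Hypothetical model: shared free counter plus a backend-local reservation tally. */
typedef struct ModelAvState
{
    int free_workers;       /* models av_freeParallelWorkers */
    int max_workers;        /* models av_maxParallelWorkers */
    int nworkers_reserved;  /* models av_nworkers_reserved */
} ModelAvState;

/* Reserve up to *nworkers; on return *nworkers holds what was granted. */
static void
model_reserve(ModelAvState *st, int *nworkers)
{
    if (*nworkers > st->free_workers)
        *nworkers = st->free_workers;
    st->free_workers -= *nworkers;
    st->nworkers_reserved += *nworkers;
}

/* Release a given number of workers, capping the free count at the maximum. */
static void
model_release(ModelAvState *st, int nworkers)
{
    assert(nworkers >= 0 && nworkers <= st->nworkers_reserved);
    st->free_workers += nworkers;
    if (st->free_workers > st->max_workers)
        st->free_workers = st->max_workers;
    st->nworkers_reserved -= nworkers;
}

/* Release everything this backend holds and check that nothing is left. */
static void
model_release_all(ModelAvState *st)
{
    if (st->nworkers_reserved > 0)
        model_release(st, st->nworkers_reserved);
    assert(st->nworkers_reserved == 0);
}

/* Reserve, then release either by launched count or via release-all. */
static int
model_still_reserved(int reserved, int launched, int use_release_all)
{
    ModelAvState st = {5, 5, 0};
    int n = reserved;

    model_reserve(&st, &n);
    if (use_release_all)
        model_release_all(&st);
    else
        model_release(&st, launched);
    return st.nworkers_reserved;
}
```

With 3 reserved and 2 launched, releasing by the launched count leaves 1 worker still reserved in this model, whereas the release-all variant leaves 0 and the trailing assertion documents that expectation.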
---
+#autovacuum_max_parallel_workers = 2 # disabled by default and limited by
+ # max_worker_processes
It's odd to me that the comment says it's disabled by default while
being set to 2. I think we can rewrite the comment to:
+#autovacuum_max_parallel_workers = 2 # limited by max_worker_processes
BTW it seems to me that this GUC should be capped by
max_parallel_workers instead of max_worker_processes, no?
---
+ long_desc => 'This parameter is capped by "max_worker_processes"
(not by "autovacuum_max_workers"!).',
I'm concerned that this kind of description might not be appropriate
for long_desc. Looking at the long_desc contents of other GUC
parameters, they describe the details of the parameter (e.g.,
"0 means xxx" or a detailed explanation of the effect). Probably we can
remove this line.
>
> > ---
> > I've attached a patch proposing some minor changes.
> >
>
> Thanks! I agree with all fixes except a single one:
> - * NOTE: We will try to provide as many workers as requested, even if caller
> - * will occupy all available workers.
>
> I think that this is a pretty important point. I'll leave this NOTE in the
> v19 patch set. Do you mind?
No, I agree with that.
>
> >
> > + /*
> > + * Number of planned and actually launched parallel workers for all index
> > + * scans, or NULL
> > + */
> > + PVWorkersUsage *workers_usage;
> >
> > I think that LVRelState can have PVWorkersUsage instead of a pointer to it.
> >
>
> Previously I used the NULL value of this pointer as a flag that we don't need
> to log workers usage. Now I'll add boolean flag for this purpose (IIUC,
> "nplanned > 0" condition is not enough to determine whether we should log
> workers usage, because VACUUM PARALLEL can be called without VERBOSE).
Can't we simply not report the worker usage if nplanned is 0?
>
> > ---
> > + /*
> > + * Allocate space for workers usage statistics. Thus, we explicitly
> > + * make clear that such statistics must be accumulated. For now, this
> > + * is used only by autovacuum leader worker, because it must log it in
> > + * the end of table processing.
> > + */
> > + vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
> > + (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
> > + NULL;
> >
> > I think we can report the worker statistics even in VACUUM VERBOSE
> > logs. Currently VACUUM VERBOSE reports the worker usage just during
> > index vacuuming but it would make sense to report the overall
> > statistics in vacuum logs. It would help make VACUUM VERBOSE logs and
> > autovacuum logs consistent.
> >
>
> Agree.
>
> > But we don't need to report the worker usage if we didn't use the
> > parallel vacuum (i.e., if nplanned == 0).
> >
>
> As I wrote above - we don't need to log workers usage if the VERBOSE option
> is not specified (even if nplanned > 0). Am I missing something?
No. My point is that even when the VERBOSE option is specified, we can
skip reporting the worker usage if the parallel vacuum is not even
planned. That is, I think we can do something like:
    if (vacrel->workers_usage.nplanned > 0)
        appendStringInfo(&buf,
                         _("parallel index vacuum/cleanup: %d workers were planned and %d workers were launched in total\n"),
                         vacrel->workers_usage.nplanned,
                         vacrel->workers_usage.nlaunched);
>
> > ---
> > + /* Remember these values, if we asked to. */
> > + if (wusage != NULL)
> > + {
> > + wusage->nlaunched += pvs->pcxt->nworkers_launched;
> > + wusage->nplanned += nworkers;
> > + }
> >
> > This code runs after the attempt to reserve parallel workers.
> > Consequently, if we fail to reserve any workers due to
> > autovacuum_max_parallel_workers, we report the status as if parallel
> > vacuum wasn't planned at all. I think knowing the number of workers
> > that were planned but not reserved would provide valuable insight for
> > users tuning autovacuum_max_parallel_workers.
> >
>
> 100% agree.
Thank you for updating the patch. I think we need an explanation
of what nlaunched and nplanned actually mean in the PVWorkersUsage
definition:
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
I'm concerned that readers might be confused that nplanned is not the
number of parallel workers we actually planned to launch.
Or it might make sense to track these three values: planned, reserved,
and launched. For example, suppose max_worker_processes = 10 and
autovacuum_max_parallel_workers = 5, if two autovacuum workers try to
reserve 3 workers each, one worker can reserve and launch 3 and
another worker can reserve and launch 2. The autovacuum logs would be
"3 planned and 3 launched" and "3 planned and 2 launched". Users can
deal with the shortage of parallel workers by increasing
autovacuum_max_parallel_workers. On the other hand, if some bgworkers
are being used by other components (e.g., parallel queries, logical
replication, etc.) and there are only 2 free bgworkers, one autovacuum
worker can reserve 3 but can launch only 2, and the other worker can
reserve 2 but cannot launch any workers. The autovacuum logs would be
"3 planned and 2 launched" and "3 planned and 0 launched". Here
increasing autovacuum_max_parallel_workers does not resolve the
shortage of parallel workers; users would have to increase
max_worker_processes instead. If we can report the worker usage like
"3 planned, 3 reserved, and 2 launched" and "3 planned, 2 reserved,
and 0 launched", users would realize the need to increase
max_worker_processes. Of course, the "xxx reserved" information would
not be necessary for VACUUM VERBOSE logs.
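To make the two scenarios above concrete, here is a minimal sketch of the accounting. It is only an illustration: WorkerUsage, request_workers(), av_free and bgw_free are hypothetical names, not code from the patch, and the model ignores locking entirely.

```c
#include <assert.h>

/* Hypothetical bookkeeping for one autovacuum leader (not the patch's struct). */
typedef struct WorkerUsage
{
    int nplanned;   /* workers the leader wanted to use */
    int nreserved;  /* workers granted from the autovacuum budget */
    int nlaunched;  /* workers that actually started */
} WorkerUsage;

static int
min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Model one leader asking for 'nplanned' workers.
 *
 * '*av_free' stands for what is left of autovacuum_max_parallel_workers and
 * '*bgw_free' for the free slots under max_worker_processes. Both are
 * assumptions of this sketch and are decremented in place.
 */
static WorkerUsage
request_workers(int nplanned, int *av_free, int *bgw_free)
{
    WorkerUsage usage;

    assert(nplanned >= 0 && *av_free >= 0 && *bgw_free >= 0);

    usage.nplanned = nplanned;

    /* Reservation is limited by the autovacuum-specific budget. */
    usage.nreserved = min_int(nplanned, *av_free);
    *av_free -= usage.nreserved;

    /* Launching is limited by the bgworker slots that are really free. */
    usage.nlaunched = min_int(usage.nreserved, *bgw_free);
    *bgw_free -= usage.nlaunched;

    return usage;
}
```

With av_free = 5 and bgw_free = 2, two leaders asking for 3 workers each end up with (3 planned, 3 reserved, 2 launched) and (3 planned, 2 reserved, 0 launched), which is the case where only the reserved count tells the user which limit to raise.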
>
> >
> > +typedef enum AVLeaderFaulureType
> > +{
> > + FAIL_NONE,
> > + FAIL_ERROR,
> > + FAIL_FATAL,
> > +} AVLeaderFaulureType;
> >
> > I'm concerned that it somewhat overlaps with what injection
> > points do, as we can set 'error' in injection_points_attach(). For
> > the FATAL error, we can terminate the autovacuum worker by using
> > pg_terminate_backend() while it keeps waiting due to
> > injection_points_attach() with action='wait'.
> >
>
> Oh, I didn't know about the possibility of testing FATAL errors with
> pg_terminate_backend. After reading your letter I found this pattern
> in signal_autovacuum.pl. This is beautiful.
> Thank you, I'll rework these tests.
+1
>
> > ---
> > It would be great if we could test the av_freeParallelWorkers
> > adjustment when max_parallel_maintenance_workers changes.
> >
>
> You mean "when autovacuum_max_parallel_workers changes"?
> I'll add a test for it.
Yes, thanks!
>
> > ---
> > + if (!AmAutoVacuumWorkerProcess())
> > + {
> > + /*
> > + * If we are autovacuum parallel worker, check whether cost-based
> > + * parameters had changed in leader worker.
> > + * If so, vacuum_cost_delay and vacuum_cost_limit will be set to the
> > + * values which leader worker is operating on.
> > + *
> > + * Do it before checking VacuumCostActive, because its value might be
> > + * changed after leader's parameters consumption.
> > + */
> > + parallel_vacuum_fix_cost_based_params();
> > + }
> >
> > We need to add checks to prevent the normal backend running the VACUUM
> > command from calling parallel_vacuum_fix_cost_based_params().
> >
>
> We already have such check inside the "fix_cost_based" function :
> /* Check whether we are running parallel autovacuum */
> if (pv_shared_cost_params == NULL)
> return false;
>
> We also have this comment:
> * If we are autovacuum parallel worker, check whether cost-based
> * parameters had changed in leader worker.
>
> As an alternative, I'll add a comment explicitly saying that the process
> will immediately return if it is not a parallel autovacuum participant.
Why don't we add an IsInParallelMode() or IsParallelWorker() check before
calling parallel_vacuum_update_shared_delay_params()?
Some review comments on v19-0003 patch:
+bool
+parallel_vacuum_update_shared_delay_params(void)
+{
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return false;
+
+ Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
This code seems a bit odd to me in two respects:
1. A process can never be both a parallel worker and an autovacuum worker.
2. If pv_shared_cost_params == NULL, even autovacuum workers and
non-parallel workers can call this function, which seems to be an
unexpected call given the subsequent assertion. If we want an
assertion to ensure that a function is called only by processes we
expect or allow, I think we should put it at the beginning of the
function. How about rewriting these parts to:
Assert(IsParallelWorker());
/* Check whether we are running parallel autovacuum */
if (pv_shared_cost_params == NULL)
return false;
---
+ * Note, that this function has no effect if we are non-autovacuum
+ * parallel worker.
+ */
I don't think this kind of comment should be noted here, since if we
change the parallel_vacuum_update_shared_delay_params() behavior in
the future, such comments would easily get out of sync.
>
> > + Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
> > +
> > + SpinLockAcquire(&pv_shared_cost_params->spinlock);
> > +
> > + vacuum_cost_delay = pv_shared_cost_params->cost_delay;
> > + vacuum_cost_limit = pv_shared_cost_params->cost_limit;
> > +
> > + SpinLockRelease(&pv_shared_cost_params->spinlock);
> >
> > IIUC autovacuum parallel workers seem to update their
> > vacuum_cost_{delay|limit} at every vacuum_delay_point(), which doesn't
> > seem good. Can we somehow avoid unnecessary updates?
>
> More precisely, parallel worker *reads* leader's parameters every delay_point.
> Obviously, this does not mean that the parameters will necessarily be updated.
>
> But I don't see anything wrong with this logic. We just every time get the most
> relevant parameters from the leader. Of course we can introduce some
> signaling mechanism, but it will have the same effect as in the current code.
Although the parameter propagation itself is working correctly, the
current implementation seems suboptimal performance-wise. Acquiring an
additional spinlock and updating the local variables for every block
seems too costly to me. IIUC we would end up incurring these costs
even when vacuum delays are disabled. I think we need to find a better
way.
For example, we can have a generation of these parameters. That is,
the leader increments the generation (stored in PVSharedCostParams)
whenever it updates them after reloading the configuration file, and
each worker keeps the generation of the parameters it currently uses.
If the worker's generation < the global generation, it updates its
parameters along with its generation. I think we can implement the
generation using pg_atomic_uint32, making the check for parameter
updates lock-free. There might be better ideas, though.
I'll review the patches for regression tests and the documentation.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[application/octet-stream] v19-0001_masahiko.patch (3.6K, 2-v19-0001_masahiko.patch)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index cb42d4e572f..1e35b82aeaf 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -764,7 +764,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Release all the reserved parallel workers for autovacuum */
if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
- AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 097b1dd55cf..bdd663610f5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -373,7 +373,6 @@ static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
static void adjust_free_parallel_workers(int prev_max_parallel_workers);
-static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -3391,6 +3390,9 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
*
* nworkers is an in/out parameter; the requested number of parallel workers
* to reserve by the caller, and set to the actual number of reserved workers.
+ *
+ * The caller must call AutoVacuumReleaseParallelWorkers() to release the
+ * reserved workers.
*/
void
AutoVacuumReserveParallelWorkers(int *nworkers)
@@ -3414,11 +3416,12 @@ AutoVacuumReserveParallelWorkers(int *nworkers)
}
/*
- * Leader autovacuum process must call this function in order to update global
- * autovacuum state, so other leaders will be able to use these parallel
- * workers.
+ * Releases the reserved parallel workers for autovacuum.
*
- * 'nworkers' - how many workers caller wants to release.
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
*/
void
AutoVacuumReleaseParallelWorkers(int nworkers)
@@ -3426,6 +3429,9 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Only leader worker can call this function. */
Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+ /* Cannot release more workers than reserved */
+ Assert(nworkers <= av_nworkers_reserved);
+
LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
/*
@@ -3443,8 +3449,8 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
}
/*
- * Same as above, but release *all* parallel workers, that were reserved by
- * current leader autovacuum process.
+ * Similar to AutoVacuumReleaseParallelWorkers(), but this function releases
+ * all the parallel workers that this autovacuum worker reserved.
*/
static void
AutoVacuumReleaseAllParallelWorkers(void)
@@ -3454,6 +3460,8 @@ AutoVacuumReleaseAllParallelWorkers(void)
if (av_nworkers_reserved > 0)
AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
}
/*
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 3f5b59a15bd..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
/* parallel autovacuum stuff */
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-17 14:52 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Daniil Davydov @ 2026-01-17 14:52 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Sat, Jan 17, 2026 at 5:20 AM Masahiko Sawada <[email protected]> wrote:
>
> Looking at AutoVacuumReserveParallelWorkers(), it seems that we don't
> check the av_maxParallelWorkers() there. Is it possible that two more
> workers would be reserved even while the existing 2 workers are
> running?
>
> /* Provide as many workers as we can. */
> *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
> AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
>
OK, I got it. You are right. I'll use the formula that you mentioned in the
previous letter. I'll also add a test for it.
>
> + /* Release all the reserved parallel workers for autovacuum */
> + if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
> + AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
>
> Since we want to release all reserved workers here, I think it's clearer
> if we use AutoVacuumReleaseAllParallelWorkers() and add
> Assert(av_nworkers_reserved == 0) at the end of
> AutoVacuumReleaseAllParallelWorkers(). This way, we can ensure that
> all workers are released, and it makes the code more readable. What do
> you think?
>
I agree that this will be clearer for readers.
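For illustration only, the reserve/release pattern can be modelled like this. It is a single-process sketch with hypothetical names (WorkerPool, pool_reserve(), pool_release(), pool_release_all()); the real counters live in shared memory and are protected by AutovacuumLock.

```c
#include <assert.h>

/* Hypothetical model of the pool as seen by one autovacuum leader. */
typedef struct WorkerPool
{
    int free_workers;  /* models av_freeParallelWorkers */
    int reserved;      /* models this leader's av_nworkers_reserved */
} WorkerPool;

/* Reserve up to 'requested' workers; returns how many were granted. */
static int
pool_reserve(WorkerPool *pool, int requested)
{
    int granted;

    assert(requested >= 0);

    granted = requested < pool->free_workers ? requested : pool->free_workers;
    pool->free_workers -= granted;
    pool->reserved += granted;
    return granted;
}

/* Release 'nworkers' workers; never more than were reserved. */
static void
pool_release(WorkerPool *pool, int nworkers)
{
    assert(nworkers >= 0 && nworkers <= pool->reserved);

    pool->free_workers += nworkers;
    pool->reserved -= nworkers;
}

/* Release everything this leader holds and verify that nothing is left. */
static void
pool_release_all(WorkerPool *pool)
{
    if (pool->reserved > 0)
        pool_release(pool, pool->reserved);

    assert(pool->reserved == 0);
}
```

In this model, releasing by the launched count would leave a reservation behind whenever fewer workers were launched than reserved, while pool_release_all() always returns the pool to its initial state.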
> I've attached the patch proposing this change (please find
> v19-0001_masahiko.patch).
Thank you, I'll apply this patch. I changed a few things in it:
1)
+ * The caller must call AutoVacuumReleaseParallelWorkers() to release the...
I think that we also should mention AutoVacuumReleaseAllParallelWorkers.
2)
+ * Similar to AutoVacuumReleaseParallelWorkers(), but this function releases...
If you don't mind, I'll leave the "same as above" formulation since this is
typical for the postgres code.
>
> +#autovacuum_max_parallel_workers = 2 # disabled by default and limited by
> + # max_worker_processes
>
> It's odd to me that the comment says it's disabled by default while
> being set to 2. I think we can rewrite the comment to:
>
> +#autovacuum_max_parallel_workers = 2 # limited by max_worker_processes
>
Good catch. I forgot to change this comment.
> BTW it seems to me that this GUC should be capped by
> max_parallel_workers instead of max_worker_processes, no?
>
I explained my point about it here [1] and here [2]. What do you think?
> ---
> + long_desc => 'This parameter is capped by "max_worker_processes"
> (not by "autovacuum_max_workers"!).',
>
> I'm concerned that this kind of description might not be appropriate
> to the description in long_desc. Looking at long_desc contents of
> other GUC parameters, we describe the detail of the parameters (e.g.,
> "0 means xxx" or detailed explanation of the effect). Probably we can
> remove this line.
>
Agree.
> >
> > Previously I used the NULL value of this pointer as a flag that we don't need
> > to log workers usage. Now I'll add boolean flag for this purpose (IIUC,
> > "nplanned > 0" condition is not enough to determine whether we should log
> > workers usage, because VACUUM PARALLEL can be called without VERBOSE).
>
> Can't we simply not report the worker usage if nplanned is 0?
>
> >
> > As I wrote above - we don't need to log workers usage if the VERBOSE option
> > is not specified (even if nplanned > 0). Am I missing something?
>
> No. My point is that even when the VERBOSE option is specified, we can
> skip reporting the worker usage if the parallel vacuum is not even
> planned. That is, I think we can do something like:
>
> if (vacrel->workers_usage.nplanned > 0)
>     appendStringInfo(&buf,
>                      _("parallel index vacuum/cleanup: %d workers were planned and %d workers were launched in total\n"),
>                      vacrel->workers_usage.nplanned,
>                      vacrel->workers_usage.nlaunched);
>
Sorry, I forgot that we accumulate the messages for logging only if
"instrument == true". That already excludes the case where a manual
VACUUM is run without the VERBOSE option.
>
> Thank you for updating the patch. I think that we need the explanation
> of what nlaunched and nplanned actually mean in the PVWorkersUsage
> definition
>
> +typedef struct PVWorkersUsage
> +{
> + int nlaunched;
> + int nplanned;
> +} PVWorkersUsage;
>
> I'm concerned that readers might be confused that nplanned is not the
> number of parallel workers we actually planned to launch.
>
> Or it might make sense to track these three values: planned, reserved,
> and launched. For example, suppose max_worker_processes = 10 and
> autovacuum_max_parallel_workers = 5, if two autovacuum workers try to
> reserve 3 workers each, one worker can reserve and launch 3 and
> another worker can reserve and launch 2. The autovacuum logs would be
> "3 planned and 3 launched" and "3 planned and 2 launched". Users can
> deal with the shortage of parallel workers by increasing
> autovacuum_max_parallel_workers. On the other hand, if some bgworkers
> are being used by other components (e.g., parallel queries, logical
> replication, etc.) and there are only 2 free bgworkers, one autovacuum
> worker can reserve 3 but can launch only 2, and the other worker can
> reserve 2 but cannot launch any workers. The autovacuum logs would be
> "3 planned and 2 launched" and "3 planned and 0 launched". Here
> increasing autovacuum_max_parallel_workers does not resolve the
> shortage of parallel workers; users would have to increase
> max_worker_processes instead. If we can report the worker usage like
> "3 planned, 3 reserved, and 2 launched" and "3 planned, 2 reserved,
> and 0 launched", users would realize the need to increase
> max_worker_processes. Of course, the "xxx reserved" information would
> not be necessary for VACUUM VERBOSE logs.
>
Hm, I think that reporting of "nreserved" would make it easier for the user to
understand what is going on. Thanks for the detailed explanation, I'll
implement it.
> >
> > We already have such check inside the "fix_cost_based" function :
> > /* Check whether we are running parallel autovacuum */
> > if (pv_shared_cost_params == NULL)
> > return false;
> >
> > We also have this comment:
> > * If we are autovacuum parallel worker, check whether cost-based
> > * parameters had changed in leader worker.
> >
> > As an alternative, I'll add a comment explicitly saying that the process
> > will immediately return if it is not a parallel autovacuum participant.
>
> Why don't we add IsInParallelMode() or IsParallelWorker() check before
> calling parallel_vacuum_update_shared_delay_params()?
Considering your suggestion below, I will add this check.
>
> +bool
> +parallel_vacuum_update_shared_delay_params(void)
> +{
> + /* Check whether we are running parallel autovacuum */
> + if (pv_shared_cost_params == NULL)
> + return false;
> +
> + Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
>
> This code seems a bit odd to me in two respects:
>
> 1. A process can never be both a parallel worker and an autovacuum worker.
>
> 2. If pv_shared_cost_params == NULL, even autovacuum workers and
> non-parallel workers can call this function, which seems to be an
> unexpected call given the subsequent assertion. If we want an
> assertion to ensure that a function is called only by processes we
> expect or allow, I think we should put it at the beginning of the
> function. How about rewriting these parts to:
>
> Assert(IsParallelWorker());
>
> /* Check whether we are running parallel autovacuum */
> if (pv_shared_cost_params == NULL)
> return false;
>
Agreed, that will be much clearer.
> ---
> + * Note, that this function has no effect if we are non-autovacuum
> + * parallel worker.
> + */
>
> I don't think this kind of comment should be noted here since if we
> change the parallel_vacuum_update_shared_delay_params() behavior in
> the future, such comments would get easily out-of-sync.
>
If the behavior changes, all comments for this function will need to
be changed anyway. Don't get me wrong - I just think that this note is
important for readers. But if you doubt its usefulness, I don't
mind deleting it.
> > > IIUC autovacuum parallel workers seem to update their
> > > vacuum_cost_{delay|limit} at every vacuum_delay_point(), which doesn't
> > > seem good. Can we somehow avoid unnecessary updates?
> >
> > More precisely, parallel worker *reads* leader's parameters every delay_point.
> > Obviously, this does not mean that the parameters will necessarily be updated.
> >
> > But I don't see anything wrong with this logic. We just every time get the most
> > relevant parameters from the leader. Of course we can introduce some
> > signaling mechanism, but it will have the same effect as in the current code.
>
> Although the parameter propagation itself is working correctly, the
> current implementation seems suboptimal performance-wise. Acquiring an
> additional spinlock and updating the local variables for every block
> seems too costly to me. IIUC we would end up incurring these costs
> even when vacuum delays are disabled. I think we need to find a better
> way.
>
> For example, we can have a generation of these parameters. That is,
> the leader increments the generation (stored in PVSharedCostParams)
> whenever it updates them after reloading the configuration file, and
> each worker keeps the generation of the parameters it currently uses.
> If the worker's generation < the global generation, it updates its
> parameters along with its generation. I think we can implement the
> generation using pg_atomic_uint32, making the check for parameter
> updates lock-free. There might be better ideas, though.
>
OK, I see your point. Considering that we need to check some shared state (in
order to understand whether we should update our params), an atomic variable
seems to be the best solution.
Thank you for the review! Please see the v20 patches. Main changes:
1) Add new formula for freeParallelWorkers computation
2) Add 'nreserved' logging for parallel autovacuum
3) Add atomic variable to speed up checking shared params state change
4) New test for autovacuum_max_parallel_workers parameter change
5) Fully get rid of "custom" injection points in tests
BTW, I think that we need more fixes for documentation, so I'll
take a look at it in the near future.
[1] https://www.postgresql.org/message-id/CAJDiXgiYiX%2BazuR76DcVx8fZn57m_4v6cB14-GW34mWa%3DqudFQ%40mail...
[2] https://www.postgresql.org/message-id/CAJDiXgjX%2BbO%3DdEZxpnsh588N3BsQ%3D7MHX3YQSJS6FxqGq4zMqQ%40ma...
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v20-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 2-v20-0005-Documentation-for-parallel-autovacuum.patch)
From 8312e2a35fd070876f91e7e9555a46e6a71d56e6 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v20 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0fad34da6eb..c64897f4707 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2849,6 +2849,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9284,6 +9285,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. Is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the enabled
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 77c5a763d45..3592c9acff9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If value is 0 then parallel degree will computed based on
+ number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v20-0004-Tests-for-parallel-autovacuum.patch (23.0K, 3-v20-0004-Tests-for-parallel-autovacuum.patch)
From 411d69b72072316f1e499a59034b7eeded92ce7e Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v20 4/5] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 60 +++
src/backend/postmaster/autovacuum.c | 19 +
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 ++
.../modules/test_autovacuum/t/001_basic.pl | 346 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 +
.../modules/test_autovacuum/test_autovacuum.c | 41 +++
.../test_autovacuum/test_autovacuum.control | 3 +
12 files changed, 550 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index e3561057334..bb4d2ef3bea 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -280,6 +281,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static void parallel_vacuum_report_cost_based_params(void);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -887,6 +889,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -896,6 +906,15 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
parallel_vacuum_process_safe_indexes(pvs);
+ /*
+ * To be able to exercise whether leader parallel autovacuum worker can
+ * propagate cost-based params to parallel workers, wait here until
+ * configuration is changed. I.e. tests are expecting, that during index
+ * processing vacuum_delay_point have been called (if config was changed).
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-after-indexes-processing", NULL);
+
/*
* Next, accumulate buffer and WAL usage. (This must wait for the workers
* to finish, or we might get incomplete data.)
@@ -1261,9 +1280,23 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Prepare to track buffer usage during parallel execution */
InstrStartParallelQuery();
+ INJECTION_POINT("parallel-worker-before-indexes-processing", NULL);
+
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * There is no guarantee that each parallel worker will necessarily
+ * process at least one index. Thus, at this point we cannot be sure that
+ * worker called vacuum_cost_delay. In order to test cost-based parameters
+ * propagation (from leader worker), call vacuum_delay_point here, if
+ * injection point is active.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("parallel-autovacuum-force-delay-point"))
+ parallel_vacuum_report_cost_based_params();
+#endif
+
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
@@ -1316,3 +1349,30 @@ parallel_vacuum_error_callback(void *arg)
return;
}
}
+
+/*
+ * Log values of the related to cost-based delay parameters. It is used for
+ * testing purpose.
+ */
+static void
+parallel_vacuum_report_cost_based_params(void)
+{
+ StringInfoData buf;
+
+ /* Simulate config reload during normal processing */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+ vacuum_delay_point(false);
+ pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+ initStringInfo(&buf);
+
+ appendStringInfo(&buf, "Vacuum cost-based delay parameters of parallel worker:\n");
+ appendStringInfo(&buf,"vacuum_cost_limit = %d\n",vacuum_cost_limit);
+ appendStringInfo(&buf, "vacuum_cost_delay = %g\n", vacuum_cost_delay);
+ appendStringInfo(&buf, "vacuum_cost_page_miss = %d\n", VacuumCostPageMiss);
+ appendStringInfo(&buf, "vacuum_cost_page_dirty = %d\n", VacuumCostPageDirty);
+ appendStringInfo(&buf, "vacuum_cost_page_hit = %d\n", VacuumCostPageHit);
+
+ ereport(LOG, errmsg("%s", buf.data));
+ pfree(buf.data);
+}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3ee858c5fbd..e6d60434bc5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -801,6 +801,8 @@ ProcessAutoVacLauncherInterrupts(void)
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
+
+ INJECTION_POINT("autovacuum-launcher-after-reload-config", NULL);
}
/* Process barrier events */
@@ -3467,6 +3469,23 @@ AutoVacuumReleaseAllParallelWorkers(void)
Assert(av_nworkers_reserved == 0);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ *
+ * For testing purpose only!
+ */
+uint32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ uint32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index f3783afb51b..52be260e15f 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
+extern uint32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4c6d56d97d8..bfe365fa575 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1b31c5b98d6..01a3e3ec044 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..065a58ef2e6
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,346 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ ANALYZE test_autovac;
+ });
+}
+
+sub wait_for_av_log
+{
+ my ($node, $expected_log) = @_;
+
+ $node->wait_for_log($expected_log);
+ truncate $node->logfile, 0 or die "truncate failed: $!";
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check that all requested workers were:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_av_log($node,
+ qr/parallel index vacuum\/cleanup: 2 workers were planned, / .
+ qr/2 workers were reserved and 2 workers were launched in total/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 20, 'All parallel workers have been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-after-indexes-processing', 'wait');
+ SELECT injection_points_attach('parallel-worker-before-indexes-processing', 'wait');
+ SELECT injection_points_attach('parallel-autovacuum-force-delay-point', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum leader launches parallel worker and falls
+# asleep on the injection point
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Reload config - the leader must update its own parameters during index
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until leader worker is guaranteed to update parameters and propagate
+# their values to the parallel worker
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-after-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-after-indexes-processing');
+});
+
+# Now wake up the parallel worker and force it to call vacuum_delay_point
+$node->wait_for_event(
+ 'parallel worker',
+ 'parallel-worker-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('parallel-worker-before-indexes-processing');
+});
+
+# Check whether worker successfully updated all parameters
+wait_for_av_log($node,
+ qr/Vacuum cost-based delay parameters of parallel worker:\n/ .
+ qr/\tvacuum_cost_limit = 500\n/ .
+ qr/\tvacuum_cost_delay = 2\n/ .
+ qr/\tvacuum_cost_page_miss = 10\n/ .
+ qr/\tvacuum_cost_page_dirty = 10\n/ .
+ qr/\tvacuum_cost_page_hit = 10\n/);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ SELECT injection_points_detach('autovacuum-leader-after-indexes-processing');
+ SELECT injection_points_detach('parallel-worker-before-indexes-processing');
+ SELECT injection_points_detach('parallel-autovacuum-force-delay-point');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 3);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ SELECT injection_points_attach('autovacuum-launcher-after-reload-config', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 1;
+ SELECT pg_reload_conf();
+});
+
+$node->wait_for_event(
+ 'autovacuum launcher',
+ 'autovacuum-launcher-after-reload-config'
+);
+
+# Since 2 parallel workers have already been launched and will be released
+# later, we expect that:
+# 1) number of free workers will be '0' after config reload
+# 2) number of free workers will be '1' after releasing workers
+
+# Check statement (1)
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 0,
+ 'Number of free parallel workers is consistent');
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-launcher-after-reload-config');
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+wait_for_av_log($node,
+ qr/parallel index vacuum\/cleanup: 2 workers were planned, / .
+ qr/2 workers were reserved and 2 workers were launched in total/);
+
+# Check statement (2)
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 1,
+ 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ SELECT injection_points_detach('autovacuum-launcher-after-reload-config');
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+wait_for_av_log($node,
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as the test above, but simulate the situation where the leader exits due to FATAL.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+# Create role with pg_signal_autovacuum_worker for terminating autovacuum worker.
+$node->safe_psql('postgres', qq{
+ CREATE ROLE regress_worker_role;
+ GRANT pg_signal_autovacuum_worker TO regress_worker_role;
+ SET ROLE regress_worker_role;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+wait_for_av_log($node,
+ qr/terminating autovacuum process due to administrator command/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..e5646e0def5
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..959629c7685
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,41 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "commands/vacuum.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ /* The SQL function is declared as RETURNS INTEGER, i.e. a signed int4 */
+ PG_RETURN_INT32((int32) nfree_workers);
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
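
The values that Test 3 in 001_basic.pl expects (0 free workers after the config reload, 1 after the leader releases its workers) follow from simple counter arithmetic. The following is a minimal standalone sketch of that arithmetic, not the patch's code: the FreePool struct and the pool_* helpers are hypothetical names used only for illustration, and the locking that protects the real counters is left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the two counters kept in shared memory. */
typedef struct FreePool
{
	int			max_workers;	/* models autovacuum_max_parallel_workers */
	int			free_workers;	/* models the number of free workers */
} FreePool;

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

static int
max_int(int a, int b)
{
	return (a > b) ? a : b;
}

/* Reserve up to "requested" workers; returns how many were granted. */
static int
pool_reserve(FreePool *pool, int requested)
{
	int			granted;

	assert(requested >= 0);
	granted = min_int(pool->free_workers, requested);
	pool->free_workers -= granted;
	return granted;
}

/* Give workers back, never exceeding the current maximum. */
static void
pool_release(FreePool *pool, int nworkers)
{
	pool->free_workers = min_int(pool->free_workers + nworkers,
								 pool->max_workers);
}

/* Apply a new maximum: shift the free count by the delta, floor at zero. */
static void
pool_set_max(FreePool *pool, int new_max)
{
	int			nfree = new_max - pool->max_workers + pool->free_workers;

	pool->free_workers = max_int(nfree, 0);
	pool->max_workers = new_max;
}
```

Starting from a maximum of 20 with 2 workers reserved (18 free), lowering the maximum to 1 gives 1 - 20 + 18 = -1, which is floored to 0. Releasing the 2 workers then gives Min(0 + 2, 1) = 1. Raising the maximum to 10 in the cleanup step yields 10 - 1 + 1 = 10, which is the value Tests 4 and 5 check.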
[text/x-patch] v20-0001-Parallel-autovacuum.patch (19.6K, 4-v20-0001-Parallel-autovacuum.patch)
From bbb5df9645dde72fcbb45f7aef141008e03226fa Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v20 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 39 +++-
src/backend/postmaster/autovacuum.c | 172 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 244 insertions(+), 10 deletions(-)
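
As a reading aid for the vacuumparallel.c hunks below, here is a small sketch of how the worker count is narrowed down before and after launch. It is an illustration under assumptions, not code from the patch: plan_workers and unlaunched_workers are hypothetical helper names, nindexes_parallel is assumed to be already computed, and max_workers stands for whichever GUC applies to the calling process.

```c
#include <assert.h>

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Planned worker count: an explicit request is limited by the number of
 * indexes eligible for parallel processing; with no explicit request
 * (nrequested == 0) every eligible index gets a worker.  The result is
 * then capped by the applicable limit, and a limit of zero disables
 * parallelism altogether.
 */
static int
plan_workers(int nrequested, int nindexes_parallel, int max_workers)
{
	int			planned;

	assert(nrequested >= 0 && nindexes_parallel >= 0 && max_workers >= 0);

	if (max_workers == 0)
		return 0;

	planned = (nrequested > 0) ?
		min_int(nrequested, nindexes_parallel) : nindexes_parallel;

	return min_int(planned, max_workers);
}

/*
 * Workers that were reserved but could not be launched; these are handed
 * back right after the launch attempt instead of at the end of the pass.
 */
static int
unlaunched_workers(int reserved, int launched)
{
	assert(launched >= 0 && launched <= reserved);
	return reserved - launched;
}
```

For example, a request for 2 workers on a table with 4 eligible indexes and a limit of 20 plans 2 workers. If only 1 of the 2 reserved workers starts, 1 is returned immediately and the other when index processing ends.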
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0b83f98ed5f..692ac46733e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..1e35b82aeaf 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched < nworkers)
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +761,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 22379de1e31..ddd7bbdf520 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +292,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -361,6 +372,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -760,6 +772,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -776,6 +790,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must adjust the
+ * number of available parallel autovacuum workers in shmem
+ * accordingly.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1380,6 +1403,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if the parallel
+ * autovacuum leader is exiting due to a FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2277,6 +2310,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after a FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2458,6 +2497,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2858,8 +2903,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process the table's indexes in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3336,6 +3385,88 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter: on input, the number of parallel workers
+ * the caller requests; on output, the number of workers actually reserved.
+ *
+ * The caller must call AutoVacuumRelease[All]ParallelWorkers() to release the
+ * reserved workers.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if that
+ * means the caller occupies all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+}
+
+/*
+ * Releases the reserved parallel workers for autovacuum.
+ *
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Cannot release more workers than reserved */
+ Assert(nworkers <= av_nworkers_reserved);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at the new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+}
+
+/*
+ * Same as above, but this function releases all the parallel workers that
+ * this autovacuum worker reserved.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3396,6 +3527,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3477,3 +3612,36 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjusts the number of free parallel workers to correspond to the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ int nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Shift the number of free parallel workers by the change in the
+ * parameter. Compute in signed arithmetic, since the counter in shared
+ * memory is unsigned and the result may be negative.
+ */
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ (int) AutoVacuumShmem->av_freeParallelWorkers;
+
+ /*
+ * Don't allow the number of free workers to become less than zero if the
+ * parameter was decreased below the number of workers currently
+ * reserved.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae9d5f3fb70..c8a99a67767 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 7c60b125564..492f66a6872 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..7b536e81791 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b91bc00062..ed59a21289c 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index db559b39c4d..ad6e19f426c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d03ab247788..c1d882659f9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will be computed based on number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
[text/x-patch] v20-0003-Cost-based-parameters-propagation-for-parallel-a.patch (8.7K, 5-v20-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From aa86e425e5f7a4c8247b8f6d39d852de36e37456 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v20 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 29 +++++-
src/backend/commands/vacuumparallel.c | 134 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
4 files changed, 164 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index aa4fbec143f..3cc84bedfb7 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2430,8 +2430,27 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
+ {
+ /*
+ * If we are parallel *autovacuum* worker, check whether the cost-based
+ * delay parameters have changed in the leader worker. If so, update our
+ * parameters to the values which the leader worker is operating on.
+ *
+ * Do it before checking VacuumCostActive, because its value might be
+ * changed after leader's parameters consumption.
+ *
+ * Note, that this function has no effect if we are non-autovacuum
+ * parallel worker.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2445,6 +2464,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * If we are parallel autovacuum leader and some of cost-based
+ * parameters had changed, let other parallel workers know.
+ */
+ parallel_vacuum_propagate_cost_based_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 44c1258b69e..e3561057334 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -53,6 +53,39 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Only autovacuum leader can reload config file. We use this structure in
+ * parallel autovacuum for keeping worker's parameters in sync with leader's
+ * parameters.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * Each time leader worker updates its parameters, it must increase
+ * generation. Every parallel worker keeps the generation
+ * (shared_params_generation_local) at which it last received
+ * parameters from the leader.
+ *
+ * It is enough for worker to compare its local generation with the field
+ * below to determine whether it needs to receive new parameters' values.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t spinlock; /* protects all fields below */
+
+ /* Copies of corresponding parameters from autovacuum leader process */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
+static PVSharedCostParams * pv_shared_cost_params = NULL;
+
+/* See comments for structure above for the explanation. */
+static uint32 shared_params_generation_local = 0;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +155,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool am_parallel_autovacuum;
+
+ /*
+ * Struct for syncing parameters between supportive parallel autovacuum
+ * workers with leader worker.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -395,6 +440,20 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->am_parallel_autovacuum = AmAutoVacuumWorkerProcess();
+
+ if (shared->am_parallel_autovacuum)
+ {
+ shared->cost_params.cost_delay = vacuum_cost_delay;
+ shared->cost_params.cost_limit = vacuum_cost_limit;
+ shared->cost_params.cost_page_dirty = VacuumCostPageDirty;
+ shared->cost_params.cost_page_hit = VacuumCostPageHit;
+ shared->cost_params.cost_page_miss = VacuumCostPageMiss;
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.spinlock);
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -537,6 +596,78 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
+/*
+ * Function to be called from parallel autovacuum worker in order to sync
+ * some cost-based delay parameter with the leader worker.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters had not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Function to be called from parallel autovacuum leader in order to propagate
+ * some cost-based parameters to the supportive workers.
+ */
+void
+parallel_vacuum_propagate_cost_based_params(void)
+{
+ uint32 params_generation;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ pv_shared_cost_params->cost_delay = vacuum_cost_delay;
+ pv_shared_cost_params->cost_limit = vacuum_cost_limit;
+ pv_shared_cost_params->cost_page_dirty = VacuumCostPageDirty;
+ pv_shared_cost_params->cost_page_hit = VacuumCostPageHit;
+ pv_shared_cost_params->cost_page_miss = VacuumCostPageMiss;
+
+ /*
+ * Increase generation of the parameters, i.e. let parallel workers know
+ * that they should re-read shared cost params.
+ */
+ pg_atomic_write_u32(&pv_shared_cost_params->generation,
+ params_generation + 1);
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1100,6 +1231,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->am_parallel_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ddd7bbdf520..3ee858c5fbd 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1692,7 +1692,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7cbb59d124f..712e1254613 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -414,6 +414,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_cost_based_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
[text/x-patch] v20-0002-Logging-for-parallel-autovacuum.patch (8.8K, 6-v20-0002-Logging-for-parallel-autovacuum.patch)
From 5d3a26a6441f6358505e853d15634e829ac06b05 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v20 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 33 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 27 +++++++++++++++++-----
src/include/commands/vacuum.h | 19 +++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 70 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1fcb212ab3d..8d6998c6f6f 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +787,9 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->vm_new_visible_frozen_pages = 0;
vacrel->vm_new_frozen_pages = 0;
+ vacrel->workers_usage.nlaunched = 0;
+ vacrel->workers_usage.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1132,24 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.nplanned > 0 &&
+ AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum */
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.nplanned,
+ vacrel->workers_usage.nreserved,
+ vacrel->workers_usage.nlaunched);
+ }
+ else if (vacrel->workers_usage.nplanned > 0)
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL) */
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.nplanned,
+ vacrel->workers_usage.nlaunched);
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2698,7 +2725,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3131,7 +3159,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 1e35b82aeaf..44c1258b69e 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -655,13 +656,23 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we asked to */
+ if (wusage != NULL && nworkers > 0)
+ wusage->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ {
AutoVacuumReserveParallelWorkers(&nworkers);
+ /* Remember this value, if we asked to */
+ if (wusage != NULL)
+ wusage->nreserved += nworkers;
+ }
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -725,6 +736,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we asked to */
+ if (wusage != NULL)
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..7cbb59d124f 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,19 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nplanned; /* # of parallel workers we are planned to
+ * launch */
+ int nreserved; /* for autovacuum only - # of parallel workers
+ * we have managed to reserve */
+ int nlaunched; /* # of launched parallel workers */
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +407,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3f3a888fd0e..afebde72235 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2404,6 +2404,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-21 22:22 Sami Imseih <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 0 replies; 112+ messages in thread
From: Sami Imseih @ 2026-01-21 22:22 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
I took a look at v20-0001,0002 and 0003 and have some comments.
v20-0001:
1/
```
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+
+ /*
+ * Don't allow number of free workers to become less than zero if the
+ * parameter was decreased.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
```
This can probably be simplified to:
```
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
```
v20-0002:
1/
I don't think showing "reserved" in the logging is needed, and it could be
confusing.
```
parallel index vacuum/cleanup: 3 workers were planned, 3 workers were
reserved and 3 workers were launched in total
```
Also, even though the table has `autovacuum_parallel_workers = 2`, I
see 3 workers. This is because one worker was for cleanup due to a GIN
index on the table. I think it's better to show separate lines for index
vacuuming and index cleanup, just like VACUUM VERBOSE.
```
INFO: launched 2 parallel vacuum workers for index vacuuming (planned: 2)
INFO: launched 1 parallel vacuum worker for index cleanup (planned: 1)
```
Otherwise, it will lead the user to think 3 workers were allocated for
either vacuuming or cleanup.
v20-0003:
1/
Inside vacuum_delay_point(), I would re-organize the checks to
first run the code block for the a/v worker:
```
if (ConfigReloadPending && AmAutoVacuumWorkerProcess())
```
then the a/v parallel worker:
```
if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
```
But I am also wondering if we should have a specific backend_type
for "autovacuum parallel worker" to differentiate that from the
existing "autovacuum worker".
We could also have a helper macro like:
```
#define AmAutoVacuumParallelWorkerProcess() (MyBackendType == B_AUTOVAC_PARALLEL_WORKER)
```
What do you think?
2/
Add
```
+typedef struct PVSharedCostParams
```
to src/tools/pgindent/typedefs.list
3/
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.spinlock);
+ pv_shared_cost_params = &(shared->cost_params);
NIT: move SpinLockInit last
4/
Instead of:
```
+ params_generation =
pg_atomic_read_u32(&pv_shared_cost_params->generation);
+
```
and then later on:
```
+ /*
+ * Increase generation of the parameters, i.e. let parallel workers know
+ * that they should re-read shared cost params.
+ */
+ pg_atomic_write_u32(&pv_shared_cost_params->generation,
+ params_generation + 1);
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
```
Why can't we just do:
pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
Also, the pg_atomic_fetch_add_u32 could be done outside of the spinlock, right?
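To make the shape I have in mind concrete, here is a rough standalone sketch. It uses C11 atomics and an atomic_flag as stand-ins for pg_atomic_uint32 and the spinlock, and every name in it (SketchSharedParams, sketch_leader_publish, sketch_worker_refresh, etc.) is hypothetical and not taken from the patch. The leader writes the parameters under the lock and bumps the generation with a single atomic add after releasing it; the worker compares generations without taking the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PVSharedCostParams (two fields only). */
typedef struct SketchSharedParams
{
    _Atomic uint32_t generation;    /* bumped by the leader on every publish */
    atomic_flag lock;               /* stand-in for the spinlock */
    double      cost_delay;
    int         cost_limit;
} SketchSharedParams;

/* Hypothetical worker-local copy of the parameters. */
typedef struct SketchWorkerParams
{
    uint32_t    generation;         /* generation we last copied */
    double      cost_delay;
    int         cost_limit;
} SketchWorkerParams;

static void
sketch_lock(SketchSharedParams *sp)
{
    while (atomic_flag_test_and_set_explicit(&sp->lock, memory_order_acquire))
        ;                           /* spin */
}

static void
sketch_unlock(SketchSharedParams *sp)
{
    atomic_flag_clear_explicit(&sp->lock, memory_order_release);
}

void
sketch_init(SketchSharedParams *sp, double delay, int limit)
{
    atomic_init(&sp->generation, 0);
    atomic_flag_clear(&sp->lock);
    sp->cost_delay = delay;
    sp->cost_limit = limit;
}

/* Leader side: write values under the lock, then bump the generation. */
void
sketch_leader_publish(SketchSharedParams *sp, double delay, int limit)
{
    sketch_lock(sp);
    sp->cost_delay = delay;
    sp->cost_limit = limit;
    sketch_unlock(sp);

    /* One atomic add, outside the lock; no separate read is needed. */
    atomic_fetch_add(&sp->generation, 1);
}

/* Worker side: lock-free check, copy only when the generation moved. */
bool
sketch_worker_refresh(SketchSharedParams *sp, SketchWorkerParams *local)
{
    uint32_t    gen = atomic_load(&sp->generation);

    assert(local->generation <= gen);
    if (gen == local->generation)
        return false;               /* nothing changed, no lock taken */

    sketch_lock(sp);
    local->cost_delay = sp->cost_delay;
    local->cost_limit = sp->cost_limit;
    sketch_unlock(sp);

    local->generation = gen;
    return true;
}
```

If the leader publishes again between the worker's generation read and its copy, the worker simply picks up the newer values and refreshes once more on its next check, which seems harmless.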
--
Sami Imseih
Amazon Web Services (AWS)
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-01-21 22:28 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-01-21 22:28 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sat, Jan 17, 2026 at 6:52 AM Daniil Davydov <[email protected]> wrote:
>
>
>
> > I've attached the patch proposing this change (please find
> > v19-0001_masahiko.patch).
>
> Thank you, I'll apply this patch. A few things in the patch that I changed :
> 1)
> + * The caller must call AutoVacuumReleaseParallelWorkers() to release the...
> I think that we also should mention AutoVacuumReleaseAllParallelWorkers.
> 2)
> + * Similar to AutoVacuumReleaseParallelWorkers(), but this function releases...
> If you don't mind, I'll leave the "same as above" formulation since this is
> typical for the postgres code.
>
Agreed.
>
> > BTW it seems to me that this GUC should be capped by
> > max_parallel_workers instead of max_worker_processes, no?
> >
>
> I explained my point about it here [1] and here [2]. What do you think?
I agree that autovacuum_max_parallel_workers should not be capped by
other GUC parameters when setting a value. However, those messages
do not seem to explain why this parameter is limited by max_worker_processes
instead of max_parallel_workers. You mentioned:
> I will keep the 'max_worker_processes' limit, so autovacuum will not
> waste time initializing a parallel context if there is no chance that
> the request will succeed.
> But it's worth remembering that actually the
> 'autovacuum_max_parallel_workers' parameter will always be implicitly
> capped by 'max_parallel_workers'.
It doesn't make sense to me that we limit
autovacuum_max_parallel_workers by max_worker_processes TBH. When
users want to have more parallel vacuum workers for autovacuum and the
VACUUM command, they would have to consider max_worker_processes,
max_parallel_workers, and max_parallel_maintenance_workers separately.
Given that max_parallel_workers controls how many of the
max_worker_processes can be used for parallel operations, I
believe that parallel vacuum workers for autovacuum should also be
taken from that pool.
>
> > ---
> > + * Note, that this function has no effect if we are non-autovacuum
> > + * parallel worker.
> > + */
> >
> > I don't think this kind of comment should be noted here since if we
> > change the parallel_vacuum_update_shared_delay_params() behavior in
> > the future, such comments would get easily out-of-sync.
> >
>
> If behavior will be changed, then all comments for this function will need to
> be changed, actually. Don't get me wrong - I just think that this Note is
> important for the readers. But if you doubt its usefulness, I don't
> mind deleting it.
I still could not figure out why it should be mentioned here instead
of at the comment of parallel_vacuum_update_shared_delay_params().
Readers can notice that calling
parallel_vacuum_update_shared_delay_params() for parallel vacuum
worker for the VACUUM command has no effect when reading the function.
In my opinion, we should mention here why we call
parallel_vacuum_update_shared_delay_params() but should not mention
what the called function does because it should have been described in
that function.
BTW can we expose pv_shared_cost_params so that we can check it in
vacuum_delay_point() before trying to call
parallel_vacuum_update_shared_delay_params()?
>
> > > > IIUC autovacuum parallel workers seem to update their
> > > > vacuum_cost_{delay|limit} at every vacuum_delay_point(), which does not
> > > > seem good. Can we somehow avoid unnecessary updates?
> > >
> > > More precisely, parallel worker *reads* leader's parameters every delay_point.
> > > Obviously, this does not mean that the parameters will necessarily be updated.
> > >
> > > But I don't see anything wrong with this logic. We just every time get the most
> > > relevant parameters from the leader. Of course we can introduce some
> > > signaling mechanism, but it will have the same effect as in the current code.
> >
> > Although the parameter propagation itself is working correctly, the
> > current implementation seems suboptimal performance-wise. Acquiring an
> > additional spinlock and updating the local variables for every block
> > seems too costly to me. IIUC we would end up incurring these costs
> > even when vacuum delays are disabled. I think we need to find a better
> > way.
> >
> > For example, we can have a generation of these parameters. That is,
> > the leader increments the generation (stored in PVSharedCostParams)
> > whenever updating them after reloading the configuration file, and
> > each worker keeps the generation of the parameters it currently uses. If
> > the worker's generation < the global generation, it updates its
> > parameters along with its generation. I think we can implement the
> > generation using pg_atomic_uint32, making the check for parameter updates
> > lock-free. There might be better ideas, though.
> >
>
> OK, I see your point. Considering that we need to check some shared state (in
> order to understand whether we should update our params), an atomic variable
> seems to be the best solution.
>
>
> Thank you for the review! Please, see v20 patches. Main changes :
> 1) Add new formula for freeParallelWorkers computation
> 2) Add 'nreserved' logging for parallel autovacuum
> 3) Add atomic variable to speed up checking shared params state change
> 4) New test for autovacuum_max_parallel_workers parameter change
> 5) Fully get rid of "custom" injection points in tests
>
The 0001 patch looks mostly good to me except for the above comment
(max_worker_processes vs. max_parallel_workers) and the following
point:
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+
+ /*
+ * Don't allow number of free workers to become less than zero if the
+ * parameter was decreased.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
Why does it do Max(x, 0) twice?
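For reference, a minimal standalone sketch of what I would expect the recomputation to look like with a single clamp. The struct and function names below are hypothetical stand-ins for the AutoVacuumShmem fields, and locking is omitted:

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant AutoVacuumShmem fields. */
typedef struct SketchWorkerPool
{
    int     max_workers;    /* last applied autovacuum_max_parallel_workers */
    int     free_workers;   /* workers currently available for reservation */
} SketchWorkerPool;

/*
 * Apply a new value of the limit.  The free count is shifted by the change
 * in the limit and clamped to zero exactly once.
 */
void
sketch_pool_apply_new_max(SketchWorkerPool *pool, int new_max)
{
    int     nfree;

    assert(pool != 0 && new_max >= 0);

    nfree = new_max - pool->max_workers + pool->free_workers;

    /* The result is already non-negative, so a second Max(x, 0) is a no-op. */
    pool->free_workers = (nfree > 0) ? nfree : 0;
    pool->max_workers = new_max;
}
```

With a limit of 4 and one free worker, lowering the limit to 2 gives nfree = 2 - 4 + 1 = -1, which the single clamp turns into 0.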
* 0002 patch:
+ if (vacrel->workers_usage.nplanned > 0 &&
+ AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum */
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.nplanned,
+ vacrel->workers_usage.nreserved,
+ vacrel->workers_usage.nlaunched);
+ }
+ else if (vacrel->workers_usage.nplanned > 0)
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL) */
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.nplanned,
+ vacrel->workers_usage.nlaunched);
+ }
Can we refactor this code to:
if (vacrel->workers_usage.nplanned > 0)
{
    if (AmAutoVacuumWorkerProcess())
        appendStringInfo(...);
    else
        appendStringInfo(...);
}
* 0003 patch:
+ if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
+ {
We can just check IsParallelWorker() here.
---
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_cost_based_params(void);
I think it's better to give these functions similar names for
consistency and readability. How about the following?
parallel_vacuum_update_delay_params();
parallel_vacuum_propagate_delay_params();
---
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ pv_shared_cost_params->cost_delay = vacuum_cost_delay;
+ pv_shared_cost_params->cost_limit = vacuum_cost_limit;
+ pv_shared_cost_params->cost_page_dirty = VacuumCostPageDirty;
+ pv_shared_cost_params->cost_page_hit = VacuumCostPageHit;
+ pv_shared_cost_params->cost_page_miss = VacuumCostPageMiss;
I think we can check if the new cost-based delay parameters are really
changed before changing the shared values. If users don't change
cost-based delay parameters, we don't need to increment the generation
at all.
---
+ pg_atomic_write_u32(&pv_shared_cost_params->generation,
+ params_generation + 1);
We can use pg_atomic_add_fetch_u32() instead.
---
+/*
+ * Only autovacuum leader can reload config file. We use this structure in
+ * parallel autovacuum for keeping worker's parameters in sync with leader's
+ * parameters.
+ */
+typedef struct PVSharedCostParams
I'd suggest writing the overall description first (e.g., what the
struct holds and what the function does etc), and then describing the
details and notes etc. For instance, readers might be confused when
reading the first sentence "Only autovacuum leader can reload config
file" as the struct definition is not related to the autovacuum
implementation fact that autovacuum workers can reload the config file
during the work. We would need to mention such detail somewhere in the
comments but I think it should not be the first sentence. How about
rewriting it to something like:
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
---
+ slock_t spinlock; /* protects all fields below */
By convention, such a field is named 'mutex'.
---
+static PVSharedCostParams * pv_shared_cost_params = NULL;
+
+/* See comments for structure above for the explanation. */
+static uint32 shared_params_generation_local = 0;
I think it's preferable to move these definitions of static variables
right before the function prototypes.
---
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenence VACUUM.
+ */
+ bool am_parallel_autovacuum;
How about renaming it to use_shared_delay_params? I think it conveys
better what the field is used for.
* 0004 patch:
The patch introduces 5 injection points, which seems overkill to me
for implementing the tests. IIUC we can implement the test2 with two
injection points: 'autovacuum-start-parallel-vacuum' (set right before
lazy_scan_heap() call) and
'autovacuum-leader-before-indexes-processing'.
1. stop the autovacuum worker at 'autovacuum-start-parallel-vacuum'.
2. change delay params and reload the conf.
3. let the autovacuum worker process tables (vacuum_delay_point() is
called during the heap scan).
4. stop the autovacuum worker at 'autovacuum-leader-before-indexes-processing'.
5. let parallel workers process indexes (vacuum_delay_point() is
called during index vacuuming).
For test3, I think we can write a DEBUG2 log in
adjust_free_parallel_workers() and check it during the test instead of
introducing the test-only function.
For test4 and test5, we check the number of free workers using
get_parallel_autovacuum_free_workers(). However, since autovacuum
could retry to vacuum the table again, the test could fail.
And here are some general comments and suggestions:
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
We need comments to explain what we test with this test file.
---
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ ANALYZE test_autovac;
+ });
Why do we need to execute ANALYZE as well?
---
+ $node->wait_for_log($expected_log);
+ truncate $node->logfile, 0 or die "truncate failed: $!";
+}
Truncating the whole log after every test would hurt debuggability a
lot. We can pass the offset as the start point to wait for the
contents.
---
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
We can use generate_series() here. And it's faster to load the data
and then create indexes.
---
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
Please use safe_psql() instead.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-02-10 15:03 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-02-10 15:03 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
Thanks everyone for the review!
**Comments on the 0001 patch**
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
>
> + AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
> +
> + /*
> + * Don't allow number of free workers to become less than zero if the
> + * patameter was decreased.
> + */
> + AutoVacuumShmem->av_freeParallelWorkers =
> + Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
> ```
>
> This can probably be simplified to:
>
> ```
> AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
> ```
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> + /*
> + * Cap or increase number of free parallel workers according to the
> + * parameter change.
> + */
> + AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
> +
> + /*
> + * Don't allow number of free workers to become less than zero if the
> + * patameter was decreased.
> + */
> + AutoVacuumShmem->av_freeParallelWorkers =
> + Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
>
> Why does it do Max(x, 0) twice?
Agreed, I missed this one. Surely it can be simplified.
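
For reference, the simplified adjustment could look like the sketch below. This is only an illustration under assumptions: adjust_free_workers() is a hypothetical stand-alone helper, not a function from the patch, and it takes plain integers instead of touching AutoVacuumShmem.

```c
#include <assert.h>

#define Max(x, y) ((x) > (y) ? (x) : (y))

/*
 * Hypothetical stand-in for the adjustment quoted above: compute the new
 * number of free parallel workers after the parameter changes from
 * prev_max to new_max.  Clamping once is enough, because
 * Max(Max(x, 0), 0) == Max(x, 0).
 */
int
adjust_free_workers(int free_workers, int prev_max, int new_max)
{
	int			nfree_workers;

	assert(prev_max >= 0 && new_max >= 0);

	/* Cap or increase the counter according to the parameter change. */
	nfree_workers = new_max - prev_max + free_workers;

	/* Don't let the counter go negative if the parameter was decreased. */
	return Max(nfree_workers, 0);
}
```

If the parameter is lowered while workers are busy, the intermediate value can be negative, and the single Max() is what keeps the counter at zero.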
--
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> On Sat, Jan 17, 2026 at 6:52 AM Daniil Davydov <[email protected]> wrote:
> >
> > I will keep the 'max_worker_processes' limit, so autovacuum will not
> > waste time initializing a parallel context if there is no chance that
> > the request will succeed.
> > But it's worth remembering that actually the
> > 'autovacuum_max_parallel_workers' parameter will always be implicitly
> > capped by 'max_parallel_workers'.
>
> It doesn't make sense to me that we limit
> autovacuum_max_parallel_workers by max_worker_processes TBH. When
> users want to have more parallel vacuum workers for autovacuum and the
> VACUUM command, they would have to consider max_worker_processes,
> max_parallel_workers, and max_parallel_maintenance_workers separately.
> Given that max_parallel_workers is controlling the number of
> max_worker_processes that can be used in parallel operations, I
> believe that parallel vacuum workers for autovacuum should also be
> taken from that pool.
Maybe I don't quite understand the meaning of "limited by". For example,
we have a max_parallel_workers_per_gather parameter, which is limited
by max_parallel_workers. But actually we can set this parameter to a value
that is higher than max_parallel_workers. The limitation is that for Gather
node we cannot request more workers than are available in bgworkers pool
(where number of free workers is always <= max_parallel_workers). Thus,
limitation actually exists only for bgworkers pool, on which other parallel
operations depend. In particular, whatever values we set for the
autovacuum_max_parallel_workers parameter, it always will depend only
on bgworkers pool.
I'll defer to your opinion and add a limitation by max_parallel_workers.
But I still don't understand what the point of an explicit limitation by
max_parallel_workers is, if we already have this limitation implicitly.
It seems a bit redundant to me. I hope I've conveyed my point correctly.
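
To make the "implicit vs. explicit" distinction concrete, here is a toy model. Everything in it is an assumption for illustration: WorkerPool, pool_launch() and effective_setting() are hypothetical names and do not exist in PostgreSQL. The implicit limit is applied at launch time by the pool, while the explicit limit clamps the setting itself.

```c
#include <assert.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))

/* Hypothetical model of the shared pool of parallel bgworkers. */
typedef struct WorkerPool
{
	int			max_parallel_workers;	/* bound for all parallel operations */
	int			in_use;			/* parallel workers currently running */
} WorkerPool;

/*
 * Implicit limit: a request may ask for any number of workers, but it only
 * gets what is currently free in the pool.
 */
int
pool_launch(WorkerPool *pool, int nrequested)
{
	int			nfree = pool->max_parallel_workers - pool->in_use;
	int			nlaunched;

	assert(nfree >= 0 && nrequested >= 0);

	nlaunched = Min(nrequested, nfree);
	pool->in_use += nlaunched;
	return nlaunched;
}

/*
 * Explicit limit: clamp the autovacuum setting itself, so that a request
 * that can never succeed is not even attempted.
 */
int
effective_setting(int autovacuum_max_parallel_workers,
				  int max_parallel_workers)
{
	return Min(autovacuum_max_parallel_workers, max_parallel_workers);
}
```

In this model both paths end up with the same number of launched workers; the explicit clamp only changes how early the limit is applied.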
**Comments on the 0002 patch**
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
> I don't think showing "reserved" in the logging is needed and could be
> confusing.
>
The rationale for this is in the previous letter of Masahiko-san, and I
agree with it. Do you think it can be confusing because users
aren't familiar with the "reserved workers" in terms of postgres?
I think that we can write about it in documentation, so users will
be ready for it.
--
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
> I think it's better
> to show separate lines for index vacuuming and index cleanup, just
> like VACUUM VERBOSE.
>
> ```
> INFO: launched 2 parallel vacuum workers for index vacuuming (planned: 2)
> INFO: launched 1 parallel vacuum worker for index cleanup (planned: 1)
> ```
>
Actually, we already have such a logging (see
parallel_vacuum_process_all_indexes function) for both VACUUM
PARALLEL and parallel autovacuum. I think that in addition we can
split the final log message (with total parallel vacuum stats) into two
lines : for vacuum and cleanup respectively. Please, see these changes
in the 0002 patch.
--
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> Can we refactoring these codes to:
>
> if (vacrel->workers_usage.nplanned > 0)
> {
> if (AmAutoVacuumWorkerProcess())
> appendStringInfo(...);
> else
> appendStringInfo(...);
I agree.
**Comments on the 0003 patch**
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> On Sat, Jan 17, 2026 at 6:52 AM Daniil Davydov <[email protected]> wrote:
> >
> > If behavior will be changed, then all comments for this function will need to
> > be changed, actually. Don't get me wrong - I just think that this Note is
> > important for the readers. But if you doubt its usefulness, I don't
> > mind deleting it.
>
> I still could not figure out why it should be mentioned here instead
> of at the comment of parallel_vacuum_update_shared_delay_params().
> Readers can notice that calling
> parallel_vacuum_update_shared_delay_params() for parallel vacuum
> worker for the VACUUM command has no effect when reading the function.
> In my opinion, we should mention here why we call
> parallel_vacuum_update_shared_delay_params() but should not mention
> what the called function does because it should have been described in
> that function.
>
OK, I agree.
> BTW can we expose pv_shared_cost_params so that we can check it in
> vacuum_delay_point() before trying to call
> parallel_vacuum_update_shared_delay_params()?
>
I would prefer not to do so. IMO it would be better if we'll encapsulate the
shared delay parameters logic inside a single file.
> + if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
> + {
>
> We can just check IsParallelWorker() here.
I agree.
--
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> +extern void parallel_vacuum_update_shared_delay_params(void);
> +extern void parallel_vacuum_propagate_cost_based_params(void);
>
> I think it's better to have similar names to these functions for
> consistency and readability. How about the following?
>
> parallel_vacuum_update_delay_params();
> parallel_vacuum_propagate_delay_params();
>
Yep, 100% agree - I just forgot to do it. If you don't mind, I would keep
the word "shared" in the function names.
--
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> + params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
> +
> + SpinLockAcquire(&pv_shared_cost_params->spinlock);
> +
> + pv_shared_cost_params->cost_delay = vacuum_cost_delay;
> + pv_shared_cost_params->cost_limit = vacuum_cost_limit;
> + pv_shared_cost_params->cost_page_dirty = VacuumCostPageDirty;
> + pv_shared_cost_params->cost_page_hit = VacuumCostPageHit;
> + pv_shared_cost_params->cost_page_miss = VacuumCostPageMiss;
>
> I think we can check if the new cost-based delay parameters are really
> changed before changing the shared values. If users don't change
> cost-based delay parameters, we don't need to increment the generation
> at all.
>
I agree.
--
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> +/*
> + * Only autovacuum leader can reload config file. We use this structure in
> + * parallel autovacuum for keeping worker's parameters in sync with leader's
> + * parameters.
> + */
> +typedef struct PVSharedCostParams
>
> I'd suggest writing the overall description first (e.g., what the
> struct holds and what the function does etc), and then describing the
> details and notes etc. For instance, readers might be confused when
> reading the first sentence "Only autovacuum leader can reload config
> file" as the struct definition is not related to the autovacuum
> implementation fact that autovacuum workers can reload the config file
> during the work. We would need to mention such detail somewhere in the
> comments but I think it should not be the first sentence. How about
> rewriting it to something like:
>
> +/*
> + * Struct for cost-based vacuum delay related parameters to share among an
> + * autovacuum worker and its parallel vacuum workers.
> + */
>
Yep, you are right.
> + slock_t spinlock; /* protects all fields below */
>
> It's convention to name 'mutex' as a field name.
>
OK.
--
> +static PVSharedCostParams * pv_shared_cost_params = NULL;
> +
> +/* See comments for structure above for the explanation. */
> +static uint32 shared_params_generation_local = 0;
>
> I think it's preferable to move these definitions of static variables
> right before the function prototypes.
>
I agree.
--
> + /*
> + * If 'true' then we are running parallel autovacuum. Otherwise, we are
> + * running parallel maintenence VACUUM.
> + */
> + bool am_parallel_autovacuum;
>
> How about renaming it to use_shared_delay_params? I think it conveys
> better what the field is used for.
I think that we should leave this name, because in the future some other
behavior differences may occur between manual VACUUM and autovacuum.
If so, we will already have an "am_autovacuum" field which we can use in
the code.
The existing logic with the "am_autovacuum" name is also LGTM - we should
use shared delay params only because we are running parallel autovacuum.
--
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
> inside vacuum_delay_point, I would re-organize the checks to
> first run the code block for the a/v worker:
>
> ```
> if (ConfigReloadPending && AmAutoVacuumWorkerProcess())
> ```
>
> then the a/v/ parallel worker:
>
> ```
> if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
> ```
>
Besides ConfigReloadPending we also must check VacuumCostActive.
I placed the call of update_shared_delay_params function *before* checking
VacuumCostActive, because parallel worker can change value of this variable
inside of this function. Also we should call functions related to a/v worker
only *after* checking the VacuumCostActive. Thus, the parallel a/v worker
logic should be called before leader a/v worker logic.
Am I missing something?
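
The ordering described above can be modelled roughly as follows. This is a simplified sketch and not the real vacuum_delay_point(): DelayState and delay_point() are hypothetical, and the backend globals are replaced by struct fields so that the order of the checks is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backend state; not the real globals. */
typedef struct DelayState
{
	bool		is_parallel_worker;		/* models IsParallelWorker() */
	bool		is_autovacuum_leader;	/* models AmAutoVacuumWorkerProcess() */
	bool		config_reload_pending;	/* models ConfigReloadPending */
	bool		cost_active;			/* models VacuumCostActive */
	bool		shared_cost_active;		/* value last published by the leader */
	int			leader_reloads;			/* how many times the leader reloaded */
} DelayState;

/* The parallel worker copies the leader's state; this may flip cost_active. */
static void
update_shared_delay_params(DelayState *st)
{
	st->cost_active = st->shared_cost_active;
}

/* Returns true if the cost-based delay logic was reached. */
bool
delay_point(DelayState *st)
{
	/* 1. Parallel worker logic first: it can change cost_active. */
	if (st->is_parallel_worker)
		update_shared_delay_params(st);

	/* 2. Nothing more to do when cost-based delay is off. */
	if (!st->cost_active)
		return false;

	/* 3. Leader logic runs only after the cost_active check. */
	if (st->config_reload_pending && st->is_autovacuum_leader)
	{
		st->config_reload_pending = false;
		st->leader_reloads++;
	}

	return true;
}
```

If step 1 came after step 2 in this model, a parallel worker that started with cost-based delay disabled would return early and never see the leader enabling it.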
--
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
> But I am also wondering if we should have a specific backend_type
> for "autovacuum parallel worker" to differentiate that from the
> existing "autovacuum worker".
>
> and also we can have a helper macro like:
> ```
> #define AmAutoVacuumParallelWorkerProcess() (MyBackendType ==
> B_AUTOVAC_PARALLEL_WORKER)
> ```
>
> What do you think?
>
I don't think that we should do it, because the workers (that are launched
by a/v worker) are technically no different from other bgworkers, that are
launched for other purposes. Since we easily can distinguish a/v parallel
worker from others, I suggest we leave it as it is.
--
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
> Add
> ```
> +typedef struct PVSharedCostParams
> ````
>
> to src/tools/pgindent/typedefs.list
>
I agree. I'll also add all new structures to the typedefs.list
--
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
>
> + pg_atomic_init_u32(&shared->cost_params.generation, 0);
> + SpinLockInit(&shared->cost_params.spinlock);
> + pv_shared_cost_params = &(shared->cost_params);
>
> NIT: move SpinLockInit last
I think that we should init the pointer to shared->cost_params only after
all of the structure's fields are initialized. Thus, I guess that SpinLockInit
should be placed before the "pv_shared_cost_params = ..." assignment.
It doesn't actually make any difference here where we place it, but I think
it's a little tidier.
--
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
>
> Instead of:
>
> ```
> + params_generation =
> pg_atomic_read_u32(&pv_shared_cost_params->generation);
> +
> ```
> and then later on:
> ````
> + /*
> + * Increase generation of the parameters, i.e. let parallel workers know
> + * that they should re-read shared cost params.
> + */
> + pg_atomic_write_u32(&pv_shared_cost_params->generation,
> + params_generation + 1);
> +
> + SpinLockRelease(&pv_shared_cost_params->spinlock);
> ```
>
> why can't we just do:
>
> pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
>
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> + pg_atomic_write_u32(&pv_shared_cost_params->generation,
> + params_generation + 1);
>
> We can use pg_atomic_add_fetch_u32() instead.
>
Yep, agreed.
--
On Thu, Jan 22, 2026 at 5:22 AM Sami Imseih <[email protected]> wrote:
>
> Also, do the pg_atomic_fetch_add_u32 outside of the spinlock. right?
>
Sure. Somehow I missed it.
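
Putting the three agreed points together (skip the update when nothing changed, bump the generation with a single atomic add, and do that outside the lock), the logic could be sketched as below. This is a stand-alone approximation and not the patch code: it uses C11 <stdatomic.h> instead of pg_atomic_* and slock_t, and SharedCostParams, propagate_params() and update_params() are hypothetical names with only two of the cost parameters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct CostParams
{
	double		cost_delay;
	int			cost_limit;
} CostParams;

typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped whenever params change */
	atomic_flag mutex;			/* protects params */
	CostParams	params;
} SharedCostParams;

static void
lock(SharedCostParams *s)
{
	while (atomic_flag_test_and_set_explicit(&s->mutex, memory_order_acquire))
		;						/* spin */
}

static void
unlock(SharedCostParams *s)
{
	atomic_flag_clear_explicit(&s->mutex, memory_order_release);
}

void
shared_init(SharedCostParams *shared, CostParams initial)
{
	atomic_init(&shared->generation, 0);
	atomic_flag_clear(&shared->mutex);
	shared->params = initial;
}

/* Leader side: publish new values only if something really changed. */
bool
propagate_params(SharedCostParams *shared, CostParams new_params)
{
	bool		changed;

	lock(shared);
	changed = shared->params.cost_delay != new_params.cost_delay ||
		shared->params.cost_limit != new_params.cost_limit;
	if (changed)
		shared->params = new_params;
	unlock(shared);

	/* Bump the generation outside the lock, with one atomic operation. */
	if (changed)
		atomic_fetch_add(&shared->generation, 1);

	return changed;
}

/* Worker side: re-read the values only when the generation has moved. */
bool
update_params(SharedCostParams *shared, unsigned *local_generation,
			  CostParams *local)
{
	unsigned	gen = atomic_load(&shared->generation);

	if (gen == *local_generation)
		return false;

	lock(shared);
	*local = shared->params;
	unlock(shared);

	*local_generation = gen;
	return true;
}
```

Note that C11 atomic_fetch_add() returns the old value; the sketch ignores the return value, so it does not matter here whether the fetch-add or the add-fetch flavour is used.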
**Comments on the 0004 patch**
On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
>
> The patch introduces 5 injection points, which seems overkill to me
> for implementing the tests. IIUC we can implement the test2 with two
> injection points: 'autovacuum-start-parallel-vacuum' (set right before
> lazy_scan_heap() call) and
> 'autovacuum-leader-before-indexes-processing'.
>
> 1. stop the autovacuum worker at 'autovacuum-start-parallel-vacuum'.
> 2. change delay params and reload the conf.
> 3. let the autovacuum worker process tables (vacuum_delay_point() is
> called during the heap scan).
> 4. stop the autovacuum worker at 'autovacuum-leader-before-indexes-processing'.
> 5. let parallel workers process indexes (vacuum_delay_point() is
> called during index vacuuming).
OK, I'll do it.
--
> For test3, I think we can write a DEBUG2 log in
> adjust_free_parallel_workers() and check it during the test instead of
> introducing the test-only function.
>
> Truncating all logs every after test would decrease the debuggability
> much. We can pass the offset as the start point to wait for the
> contents.
>
I've combined two of your above comments on purpose. I agree that truncating
all logs is a bad decision and we need to solve this in a different way. But a
problem will occur if we want to 1) use logging instead of a test-only function
and 2) use offsets as the start point to wait for the contents in the logfile.
Imagine that we (using the described approach) need to wait until the end of
parallel index processing and determine the current number of free parallel
workers.
IIUC, it'll look like this:
wait_for_av_log("autovacuum processing finished");
wait_for_av_log("number of free workers = N");
But when we call wait_for_av_log the first time, we will advance "offset" to the
end of the logfile and thus we will miss the "number of free workers = N"
message. The only way to avoid it is to write a function that will determine
the exact position of "autovacuum processing finished" in the logfile. Isn't
it too much?
I think that using wait_for_av_log("autovacuum processing finished"); +
SELECT get_parallel_autovacuum_free_workers(); will be much clearer
and simpler.
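
The offset problem can be shown with a toy model of the log search. This is only a sketch: find_from() is a hypothetical helper that searches an in-memory string, whereas the real tests wait on the server logfile from Perl.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of "wait for a log line starting at an offset":
 * search 'log' for 'pattern' beginning at byte 'offset'.  Returns the
 * position of the match, or -1 if there is none.
 */
long
find_from(const char *log, size_t offset, const char *pattern)
{
	const char *hit;

	if (offset > strlen(log))
		return -1;

	hit = strstr(log + offset, pattern);
	return hit ? (long) (hit - log) : -1;
}
```

With a log that already contains both messages, searching for the second one from the end of the log finds nothing, while searching from the end of the first match finds it. That second variant is the "exact position" function mentioned above.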
Moreover, the AutoVacuumGetFreeParallelWorkers function doesn't
seem completely useless in isolation from tests. I suggest leaving
this function and its usage in the tests. I can remove the "For testing
purpose only!" comment, so everyone will be free to use this function
in the future.
> For test4 and test5, we check the number of free workers using
> get_parallel_autovacuum_free_workers(). However, since autovacuum
> could retry to vacuum the table again, the test could fail.
Yep, good catch.
1)
Test 5 can be stabilized as follows :
We can attach to the "autovacuum-start-parallel-vacuum" injection point in
the "wait" mode. Thereby when we terminate the first a/v leader, we are
guaranteed that no other a/v leader will reach release/reserve functions.
And then we are free to call the get_parallel_autovacuum_free_workers
function. I'll additionally describe this logic in the test.
2)
In test 4 I found another problem: when the a/v leader errors out, it will
exit() pretty soon. And during exit() it will call the before_shmem_exit hook.
Thus, we cannot be sure that parallel workers have been released exactly
in the try/catch block. In order to guarantee it, I think that we should log
something inside the try/catch block. I added some pretty controversial logging
code for it, but it is the best I came up with.
In the test 4 the above idea will look something like this:
$log_start = $node->wait_for_log(
qr/error triggered for injection point / .
qr/autovacuum-leader-before-indexes-processing/,
$log_start
);
$log_start = $node->wait_for_log(
qr/2 parallel autovacuum workers has been released after occured error/,
$log_start
);
Above I described a problem that may occur when we advance
"logfile offset" too far after the first wait_for_log call. Here, this problem
doesn't occur because the autovacuum launcher infinitely tries to
vacuum the table, so other "N workers released" messages occur.
--
> And here are some general comments and suggestions:
>
> +use warnings FATAL => 'all';
> +use PostgreSQL::Test::Cluster;
> +use PostgreSQL::Test::Utils;
> +use Test::More;
>
> We need comments to explain what we test with this test file.
>
OK, I'll add it. I suppose I can limit myself to a simple
"Test parallel autovacuum behavior", because the specific test scenarios
are described below.
--
> + $node->safe_psql('postgres', qq{
> + UPDATE test_autovac SET col_1 = $test_number;
> + ANALYZE test_autovac;
> + });
>
> Why do we need to execute ANALYZE as well?
I added ANALYZE just in case. But I see that statistics on deleted and
updated tuples are accumulated at the end of the transaction, so I agree
that we can get rid of ANALYZE here.
--
> +# Insert specified tuples num into the table
> +$node->safe_psql('postgres', qq{
> + DO \$\$
> + DECLARE
> + i INTEGER;
> + BEGIN
> + FOR i IN 1..$initial_rows_num LOOP
> + INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
> + END LOOP;
> + END \$\$;
> +});
>
> We can use generate_series() here. And it's faster to load the data
> and then create indexes.
OK, I'll fix it.
--
> +$node->psql('postgres',
> + "SELECT get_parallel_autovacuum_free_workers();",
> + stdout => \$psql_out,
> +);
>
> Please use safe_psql() instead.
Sure!
--
Again, thanks everyone for the review!
I hope I didn't miss anything.
Please, see updated sets of patches.
This time I'll try something experimental - besides the patches I'll also
post differences between corresponding patches from v20 and v21.
I.e. you can apply v20--v21-diff-for-0001 on the v20-0001 patch and
get the v21-0001 patch. There are a lot of changes, so I guess it will
help you during review. Please, let me know whether it is useful for you.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v21-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 2-v21-0005-Documentation-for-parallel-autovacuum.patch)
From ed32c12c05dac5fbfca55c7424f0c80d9d58fef5 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v21 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 37342986969..a6869b03753 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2886,6 +2886,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9351,6 +9352,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. Is capped by
+ <xref linkend="guc-max-parallel-workers"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the enabled
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 77c5a763d45..3592c9acff9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If value is 0 then parallel degree will be computed based on
+ number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v21-0001-Parallel-autovacuum.patch (19.4K, 3-v21-0001-Parallel-autovacuum.patch)
From 32064b095f6319033bb87e63c16c5f1e323bfe0f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v21 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..d3e0c32b7ee 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6fde740465f..f40abe90ed5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +292,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -361,6 +372,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -759,6 +771,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -775,6 +789,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must update the
+ * number of available parallel autovacuum workers in shmem
+ * accordingly.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1379,6 +1402,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2275,6 +2308,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2456,6 +2495,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2856,8 +2901,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3334,6 +3383,88 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter: on input, the number of parallel workers
+ * the caller wants to reserve; on output, the number actually reserved.
+ *
+ * The caller must call AutoVacuumRelease[All]ParallelWorkers() to release the
+ * reserved workers.
+ *
+ * NOTE: We try to provide as many workers as requested, even if that means
+ * the caller occupies all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+}
+
+/*
+ * Releases the reserved parallel workers for autovacuum.
+ *
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Cannot release more workers than reserved */
+ Assert(nworkers <= av_nworkers_reserved);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at the new maximum.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+}
+
+/*
+ * Same as above, but this function releases all the parallel workers that
+ * this autovacuum worker reserved.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3394,6 +3525,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_parallel_workers);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3475,3 +3610,28 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjusts the number of free parallel workers to match the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ int nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae9d5f3fb70..c8a99a67767 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only cost-based
+ * delay parameters need to be propagated to parallel vacuum workers, and
+ * that is handled elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 762b8efe6b0..361bdb9a720 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Sets the maximum number of parallel autovacuum workers that can be taken from the background worker pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 6e82c8e055d..2d4b9d27e8b 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -709,6 +709,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b91bc00062..ed59a21289c 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index db559b39c4d..ad6e19f426c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..7c5e35a486c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree is computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
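For anyone following the accounting in 0001 without a build tree at hand, here is a minimal standalone sketch of the reserve/release/adjust arithmetic. It is only a model of the logic in AutoVacuumReserveParallelWorkers(), AutoVacuumReleaseParallelWorkers() and adjust_free_parallel_workers(): the SketchPool type and the sketch_* function names are hypothetical, and the AutovacuumLock protection and the per-process av_nworkers_reserved bookkeeping are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two counters the patch adds to
 * AutoVacuumShmemStruct. Locking is omitted in this sketch.
 */
typedef struct SketchPool
{
	uint32_t	free_workers;	/* models av_freeParallelWorkers */
	uint32_t	max_workers;	/* models av_maxParallelWorkers */
} SketchPool;

static uint32_t
sketch_min(uint32_t a, uint32_t b)
{
	return (a < b) ? a : b;
}

/* Reserve up to 'requested' workers; returns the number actually granted. */
static int
sketch_reserve(SketchPool *pool, int requested)
{
	uint32_t	granted;

	if (requested <= 0)
		return 0;

	granted = sketch_min(pool->free_workers, (uint32_t) requested);
	pool->free_workers -= granted;
	return (int) granted;
}

/* Give workers back, never exceeding the current maximum. */
static void
sketch_release(SketchPool *pool, int nworkers)
{
	pool->free_workers = sketch_min(pool->free_workers + (uint32_t) nworkers,
									pool->max_workers);
}

/* Apply a change of the limit from prev_max to new_max. */
static void
sketch_adjust(SketchPool *pool, int prev_max, int new_max)
{
	int			nfree = new_max - prev_max + (int) pool->free_workers;

	pool->free_workers = (uint32_t) ((nfree > 0) ? nfree : 0);
	pool->max_workers = (uint32_t) new_max;
}
```

The case worth tracing is a limit reduction while workers are still reserved: with a pool of 2, a leader reserves both, the limit drops to 1 (free stays at 0 because the difference is clamped), and the later release of 2 workers is capped so that only 1 becomes free again.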
[text/x-patch] v21-0002-Logging-for-parallel-autovacuum.patch (10.2K, 4-v21-0002-Logging-for-parallel-autovacuum.patch)
From ba2a21114126d6c2b9ea8629a7299332e573136a Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v21 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 61 ++++++++++++++++++++++++++-
src/backend/commands/vacuumparallel.c | 29 ++++++++++---
src/include/commands/vacuum.h | 28 +++++++++++-
src/tools/pgindent/typedefs.list | 3 ++
4 files changed, 111 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 4be267ff657..d19e15cbcce 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -340,6 +340,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -778,6 +784,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->vm_new_visible_frozen_pages = 0;
vacrel->vm_new_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1120,6 +1131,50 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ /* Stats for vacuum phase of index vacuuming. */
+
+ if (AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel index vacuum: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL). */
+ appendStringInfo(&buf,
+ _("parallel index vacuum: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ /* Stats for cleanup phase of index vacuuming. */
+
+ if (AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel index cleanup: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nreserved,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ else
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL). */
+ appendStringInfo(&buf,
+ _("parallel index cleanup: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2664,7 +2719,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3097,7 +3153,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index d3e0c32b7ee..86d9f2b74c9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -521,7 +522,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +535,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -618,7 +621,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -655,13 +658,23 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ {
AutoVacuumReserveParallelWorkers(&nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nreserved += nworkers;
+ }
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -728,6 +741,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..d3dc4e8cc67 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ int nplanned; /* # of parallel workers we planned to
+ * launch */
+ int nreserved; /* for autovacuum only - # of parallel workers
+ * we have managed to reserve */
+ int nlaunched; /* # of launched parallel workers */
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index df42b78bc9d..d84308c87ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2067,6 +2067,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersStats
+PVWorkersUsage
PX_Alias
PX_Cipher
PX_Combo
@@ -2405,6 +2407,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
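To make the counters in 0002 easier to picture, below is a small standalone sketch of how per-pass numbers could add up and end up in the log line. It is an illustration under simplifying assumptions, not the patch code: SketchWorkersStats mirrors the fields of PVWorkersStats, while sketch_account_pass() and sketch_format_usage() are hypothetical helpers. In the patch itself the three counters are updated at separate points of parallel_vacuum_process_all_indexes(), and nreserved is only touched in an autovacuum worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Mirrors the fields of PVWorkersStats; the type name is hypothetical. */
typedef struct SketchWorkersStats
{
	int			nplanned;
	int			nreserved;
	int			nlaunched;
} SketchWorkersStats;

/* Accumulate the numbers of one index vacuum/cleanup pass into the totals. */
static void
sketch_account_pass(SketchWorkersStats *stats, int planned, int reserved,
					int launched)
{
	/* Nothing to remember if no stats were requested or nothing was planned */
	if (stats == NULL || planned <= 0)
		return;

	stats->nplanned += planned;
	stats->nreserved += reserved;
	stats->nlaunched += launched;
}

/*
 * Format the summary line for one phase ("vacuum" or "cleanup"). The
 * autovacuum variant additionally reports the reserved count.
 */
static int
sketch_format_usage(char *buf, size_t len, const char *phase,
					const SketchWorkersStats *stats, bool is_autovacuum)
{
	if (is_autovacuum)
		return snprintf(buf, len,
						"parallel index %s: %d workers were planned, %d workers were reserved and %d workers were launched in total",
						phase, stats->nplanned, stats->nreserved,
						stats->nlaunched);

	return snprintf(buf, len,
					"parallel index %s: %d workers were planned and %d workers were launched in total",
					phase, stats->nplanned, stats->nlaunched);
}
```

Since the totals are summed over all index scans of one table, a table that needs two bulk-delete passes with 2 planned workers each is reported as 4 planned workers, not 2.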
[text/x-patch] v21-0004-Tests-for-parallel-autovacuum.patch (23.3K, 5-v21-0004-Tests-for-parallel-autovacuum.patch)
From 6a9900a79e72e7e5366265ae074939742006ea08 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v21 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 7 +
src/backend/commands/vacuumparallel.c | 55 +++
src/backend/postmaster/autovacuum.c | 28 ++
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 ++
.../modules/test_autovacuum/t/001_basic.pl | 332 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 +
.../modules/test_autovacuum/test_autovacuum.c | 41 +++
.../test_autovacuum/test_autovacuum.control | 3 +
13 files changed, 547 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d19e15cbcce..2e85f7f17f7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -151,6 +151,7 @@
#include "storage/freespace.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -869,6 +870,12 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ccb3812165c..f9f29f940c9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -303,6 +304,10 @@ static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_inde
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+#ifdef USE_INJECTION_POINTS
+static void parallel_vacuum_report_cost_based_params(void);
+#endif
+
/*
* Try to enter parallel mode and create a parallel context. Then initialize
* shared memory state.
@@ -922,6 +927,17 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+ /*
+ * To be able to test that all reserved parallel workers are released
+ * even on failure, allow injection points to trigger an error at this
+ * point.
+ *
+ * This injection point is also used to wait until parallel workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1299,6 +1315,16 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * If we are a parallel autovacuum worker, we can consume delay parameters
+ * during index processing (via vacuum_delay_point call). This logging
+ * allows tests to verify that.
+ */
+ if (shared->am_parallel_autovacuum)
+ parallel_vacuum_report_cost_based_params();
+#endif
+
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
@@ -1351,3 +1377,32 @@ parallel_vacuum_error_callback(void *arg)
return;
}
}
+
+#ifdef USE_INJECTION_POINTS
+/*
+ * Log the values of the cost-based delay parameters. Used for testing
+ * purposes.
+ */
+static void
+parallel_vacuum_report_cost_based_params(void)
+{
+ StringInfoData buf;
+
+ /* Simulate config reload during normal processing */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+ vacuum_delay_point(false);
+ pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+ initStringInfo(&buf);
+
+ appendStringInfo(&buf, "Vacuum cost-based delay parameters of parallel worker:\n");
+ appendStringInfo(&buf, "vacuum_cost_limit = %d\n", vacuum_cost_limit);
+ appendStringInfo(&buf, "vacuum_cost_delay = %g\n", vacuum_cost_delay);
+ appendStringInfo(&buf, "vacuum_cost_page_miss = %d\n", VacuumCostPageMiss);
+ appendStringInfo(&buf, "vacuum_cost_page_dirty = %d\n", VacuumCostPageDirty);
+ appendStringInfo(&buf, "vacuum_cost_page_hit = %d\n", VacuumCostPageHit);
+
+ ereport(DEBUG2, errmsg("%s", buf.data));
+ pfree(buf.data);
+}
+#endif
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 0d78d02bd09..7b24a5d6e67 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2495,12 +2495,20 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ int nreserved_workers = av_nworkers_reserved;
+
/*
* Parallel autovacuum can reserve parallel workers. Make sure
* that all reserved workers are released.
*/
AutoVacuumReleaseAllParallelWorkers();
+ if (nreserved_workers > 0)
+ ereport(DEBUG2,
+ (errmsg("%d parallel autovacuum workers has been released after occured error",
+ nreserved_workers),
+ errhidecontext(true)));
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -3465,6 +3473,21 @@ AutoVacuumReleaseAllParallelWorkers(void)
Assert(av_nworkers_reserved == 0);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ */
+uint32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ uint32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_SHARED);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3633,5 +3656,10 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+ ereport(DEBUG2,
+ (errmsg("number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers),
+ errhidecontext(true)));
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index f3783afb51b..52be260e15f 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
+extern uint32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 44c7163c1cd..937dbb64fd2 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..5ac8d87702d 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..b3d22361dcf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,332 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my ($psql_out, $log_start);
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently.
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+
+ INSERT INTO test_autovac (col_1, col_2, col_3, col_4)
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers were:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/parallel index vacuum: 2 workers were planned, / .
+ qr/2 workers were reserved and 2 workers were launched in total/,
+ $log_start
+);
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 20, 'All parallel workers have been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during indexes
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/Vacuum cost-based delay parameters of parallel worker:\n/ .
+ qr/\tvacuum_cost_limit = 500\n/ .
+ qr/\tvacuum_cost_delay = 2\n/ .
+ qr/\tvacuum_cost_page_miss = 10\n/ .
+ qr/\tvacuum_cost_page_dirty = 10\n/ .
+ qr/\tvacuum_cost_page_hit = 10\n/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 3);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 1;
+ SELECT pg_reload_conf();
+});
+
+# Since 2 parallel workers have already been launched and will be released later,
+# we expect that:
+# 1) number of free workers will be '0' after config reload
+# 2) number of free workers will be '1' after releasing workers
+
+# Check statement (1)
+$log_start = $node->wait_for_log(
+ qr/number of free parallel autovacuum workers is set to 0 due to config reload/,
+ $log_start
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+$log_start = $node->wait_for_log(
+ qr/parallel index vacuum: 2 workers were planned, / .
+ qr/2 workers were reserved and 2 workers were launched in total/,
+ $log_start
+);
+
+# Check statement (2)
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 1, 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if the leader gets an
+# error. First, simulate the situation where the leader exits due to an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$log_start = $node->wait_for_log(
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/,
+ $log_start
+);
+
+$log_start = $node->wait_for_log(
+ qr/2 parallel autovacuum workers has been released after occured error/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as the test above, but simulate the situation where the leader exits due to FATAL.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized and wake up the leader
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+# Create role with pg_signal_autovacuum_worker for terminating autovacuum worker.
+$node->safe_psql('postgres', qq{
+ CREATE ROLE regress_worker_role;
+ GRANT pg_signal_autovacuum_worker TO regress_worker_role;
+ SET ROLE regress_worker_role;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+$log_start = $node->wait_for_log(
+ qr/terminating autovacuum process due to administrator command/,
+ $log_start
+);
+
+# Now it is safe to check the number of free parallel workers, because even if
+# autovacuum is trying to vacuum table in parallel mode again, the leader
+# worker cannot go any further than "autovacuum-start-parallel-vacuum" point.
+# I.e. no one can interfere and change the number of free parallel workers.
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 10, 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..e5646e0def5
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..959629c7685
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,41 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "commands/vacuum.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_UINT32(nfree_workers);
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
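A minimal sketch of the free-worker accounting that Test 3 in 001_basic.pl exercises, assuming a plain struct in place of AutoVacuumShmem. Only the clamp to zero (Max(nfree_workers, 0)) is visible in the adjust_free_parallel_workers() hunk; the delta formula on reload, the cap applied on release, and every Demo*/demo_* name are hypothetical and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared autovacuum counters. */
typedef struct DemoPool
{
    int max_workers;    /* plays the role of av_maxParallelWorkers */
    int free_workers;   /* plays the role of av_freeParallelWorkers */
} DemoPool;

/* Reserve up to "wanted" workers; returns how many were granted. */
static int
demo_reserve(DemoPool *pool, int wanted)
{
    int granted;

    assert(wanted >= 0);
    granted = wanted < pool->free_workers ? wanted : pool->free_workers;
    pool->free_workers -= granted;
    return granted;
}

/* Give workers back. ASSUMPTION: the free count never exceeds the maximum. */
static void
demo_release(DemoPool *pool, int nworkers)
{
    int nfree = pool->free_workers + nworkers;

    pool->free_workers = nfree < pool->max_workers ? nfree : pool->max_workers;
}

/*
 * Config reload. ASSUMPTION: the free count shifts by the change in the
 * maximum; the result is clamped at zero, as the patch does with Max().
 */
static void
demo_set_max(DemoPool *pool, int new_max)
{
    int nfree = pool->free_workers + (new_max - pool->max_workers);

    pool->free_workers = nfree > 0 ? nfree : 0;
    pool->max_workers = new_max;
}
```

Under these assumptions, starting from 20 free workers and reserving 2 leaves 18; lowering the maximum to 1 gives 18 - 19 = -1, clamped to 0; releasing the 2 workers then yields min(0 + 2, 1) = 1. Those are the two values the test checks after the reload and after the release.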
[text/x-patch] v21-0003-Cost-based-parameters-propagation-for-parallel-a.patch (9.8K, 6-v21-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From 21cbbbe37e36b53ac70b9827296a3430aba4680f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v21 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 23 +++-
src/backend/commands/vacuumparallel.c | 164 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 2 +
5 files changed, 190 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 03932f45c8a..70882544d05 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2430,8 +2430,21 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Possibly update cost-based delay parameters.
+ *
+ * Do it before checking VacuumCostActive, because its value might be
+ * changed after calling this function.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2445,6 +2458,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * If we are parallel autovacuum leader and some of cost-based
+ * parameters had changed, let other parallel workers know.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 86d9f2b74c9..ccb3812165c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -53,6 +53,56 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Helper for the PVSharedCostParams structure (see below), to avoid
+ * repetition.
+ */
+typedef struct CostParamsData
+{
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} CostParamsData;
+
+#define FillCostParamsData(cost_params) \
+ (cost_params)->cost_delay = vacuum_cost_delay; \
+ (cost_params)->cost_limit = vacuum_cost_limit; \
+ (cost_params)->cost_page_dirty = VacuumCostPageDirty; \
+ (cost_params)->cost_page_hit = VacuumCostPageHit; \
+ (cost_params)->cost_page_miss = VacuumCostPageMiss
+
+#define CostParamsDataEqual(params_1, params_2) \
+ ((params_1).cost_delay == (params_2).cost_delay && \
+ (params_1).cost_limit == (params_2).cost_limit && \
+ (params_1).cost_page_dirty == (params_2).cost_page_dirty && \
+ (params_1).cost_page_hit == (params_2).cost_page_hit && \
+ (params_1).cost_page_miss == (params_2).cost_page_miss)
+
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * Each time the leader worker updates its parameters, it must increase the
+ * generation. Every parallel worker keeps the generation
+ * (shared_params_generation_local) at which it last received parameters
+ * from the leader.
+ *
+ * It is enough for a worker to compare its local generation with the field
+ * below to determine whether it needs to receive new parameter values.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Copies of corresponding parameters from autovacuum leader process */
+ CostParamsData params_data;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +172,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool am_parallel_autovacuum;
+
+ /*
+ * Struct for keeping the cost-based delay parameters of parallel autovacuum
+ * workers in sync with those of the leader worker.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -224,6 +286,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments for the PVSharedCostParams structure for the explanation. */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -395,6 +462,17 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->am_parallel_autovacuum = AmAutoVacuumWorkerProcess();
+
+ if (shared->am_parallel_autovacuum)
+ {
+ FillCostParamsData(&shared->cost_params.params_data);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -539,6 +617,89 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
&wusage->cleanup);
}
+/*
+ * If we are a parallel *autovacuum* worker, check whether the cost-based
+ * delay parameters have changed in the leader worker. If so, update the
+ * local parameters to the values the leader worker is currently
+ * operating on.
+ *
+ * For non-autovacuum parallel worker this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+ CostParamsData shared_params_data;
+
+ Assert(IsParallelWorker());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters had not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+
+ shared_params_data = pv_shared_cost_params->params_data;
+
+ VacuumCostDelay = shared_params_data.cost_delay;
+ VacuumCostLimit = shared_params_data.cost_limit;
+ VacuumCostPageDirty = shared_params_data.cost_page_dirty;
+ VacuumCostPageHit = shared_params_data.cost_page_hit;
+ VacuumCostPageMiss = shared_params_data.cost_page_miss;
+
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Function to be called from parallel autovacuum leader in order to propagate
+ * some cost-based parameters to the supportive workers.
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ CostParamsData local_params_data;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ FillCostParamsData(&local_params_data);
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+
+ if (CostParamsDataEqual(pv_shared_cost_params->params_data,
+ local_params_data))
+ {
+ /*
+ * We don't need to update shared delay params if they haven't
+ * changed.
+ */
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+ return;
+ }
+
+ FillCostParamsData(&pv_shared_cost_params->params_data);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increase generation of the parameters, i.e. let parallel workers know
+ * that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1105,6 +1266,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->am_parallel_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f40abe90ed5..0d78d02bd09 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1690,7 +1690,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index d3dc4e8cc67..b10829a9379 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -423,6 +423,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d84308c87ad..d00b57d3186 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -544,6 +544,7 @@ CopyToRoutine
CopyToState
CopyToStateData
Cost
+CostParamsData
CostSelector
Counters
CoverExt
@@ -2067,6 +2068,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
--
2.43.0
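The generation counter used by PVSharedCostParams can be illustrated outside PostgreSQL. What follows is a minimal standalone C11 sketch of the same idea, assuming a toy atomic_flag spinlock in place of slock_t and <stdatomic.h> in place of pg_atomic_uint32. The Demo* types and demo_* functions are hypothetical names, and only two of the five parameters are carried to keep it short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, trimmed stand-in for CostParamsData. */
typedef struct DemoCostParams
{
    double cost_delay;
    int    cost_limit;
} DemoCostParams;

/* Hypothetical stand-in for PVSharedCostParams. */
typedef struct DemoSharedParams
{
    atomic_uint    generation;  /* bumped after every real change */
    atomic_flag    lock;        /* toy spinlock protecting "params" */
    DemoCostParams params;
} DemoSharedParams;

static void
demo_lock(DemoSharedParams *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                       /* spin until the flag was clear */
}

static void
demo_unlock(DemoSharedParams *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

static void
demo_init(DemoSharedParams *s, DemoCostParams initial)
{
    atomic_init(&s->generation, 0);
    atomic_flag_clear(&s->lock);
    s->params = initial;
}

/* Leader side: publish only when a value changed; true if published. */
static bool
demo_publish(DemoSharedParams *s, DemoCostParams now)
{
    bool changed;

    demo_lock(s);
    changed = (s->params.cost_delay != now.cost_delay ||
               s->params.cost_limit != now.cost_limit);
    if (changed)
        s->params = now;
    demo_unlock(s);

    /* Bump the generation after the data is in place. */
    if (changed)
        atomic_fetch_add(&s->generation, 1);
    return changed;
}

/* Worker side: copy the shared values only if the generation moved. */
static bool
demo_refresh(DemoSharedParams *s, unsigned *local_gen, DemoCostParams *local)
{
    unsigned gen = atomic_load(&s->generation);

    assert(*local_gen <= gen);
    if (gen == *local_gen)
        return false;           /* cheap path: no lock taken */

    demo_lock(s);
    *local = s->params;
    demo_unlock(s);

    *local_gen = gen;
    return true;
}
```

In this sketch the worker reads the generation before copying the parameters. If the leader publishes in between, the worker ends up with the newer values but records the older generation, so the worst case is one redundant refresh on the next check and never a missed update.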
[text/x-patch] v20--v21-diff-for-0003.patch (9.9K, 7-v20--v21-diff-for-0003.patch)
From 5e2b470db102546c3124da1eef8acc42d5c2fead Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 7 Feb 2026 11:07:44 +0700
Subject: [PATCH 5/7] fixes for patch 3
---
src/backend/commands/vacuum.c | 14 +--
src/backend/commands/vacuumparallel.c | 120 ++++++++++++++++----------
src/include/commands/vacuum.h | 2 +-
src/tools/pgindent/typedefs.list | 2 +
4 files changed, 82 insertions(+), 56 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 9847ed1c2da..70882544d05 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2433,19 +2433,13 @@ vacuum_delay_point(bool is_analyze)
if (InterruptPending)
return;
- if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
+ if (IsParallelWorker())
{
/*
- * If we are parallel *autovacuum* worker, check whether related to
- * cost-based delay parameters had changed in the leader worker. If
- * so, corresponding parameters will be updated to the values which
- * leader worker is operating on.
+ * Possibly update cost-based delay parameters.
*
* Do it before checking VacuumCostActive, because its value might be
- * changed after leader's parameters consumption.
- *
- * Note, that this function has no effect if we are non-autovacuum
- * parallel worker.
+ * changed after calling this function.
*/
parallel_vacuum_update_shared_delay_params();
}
@@ -2469,7 +2463,7 @@ vacuum_delay_point(bool is_analyze)
* If we are parallel autovacuum leader and some of cost-based
* parameters had changed, let other parallel workers know.
*/
- parallel_vacuum_propagate_cost_based_params();
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 640173eada8..ccb3812165c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -54,9 +54,35 @@
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
/*
- * Only autovacuum leader can reload config file. We use this structure in
- * parallel autovacuum for keeping worker's parameters in sync with leader's
- * parameters.
+ * Helper for the PVSharedCostParams structure (see below), to avoid
+ * repetition.
+ */
+typedef struct CostParamsData
+{
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} CostParamsData;
+
+#define FillCostParamsData(cost_params) \
+ (cost_params)->cost_delay = vacuum_cost_delay; \
+ (cost_params)->cost_limit = vacuum_cost_limit; \
+ (cost_params)->cost_page_dirty = VacuumCostPageDirty; \
+ (cost_params)->cost_page_hit = VacuumCostPageHit; \
+ (cost_params)->cost_page_miss = VacuumCostPageMiss
+
+#define CostParamsDataEqual(params_1, params_2) \
+ ((params_1).cost_delay == (params_2).cost_delay && \
+ (params_1).cost_limit == (params_2).cost_limit && \
+ (params_1).cost_page_dirty == (params_2).cost_page_dirty && \
+ (params_1).cost_page_hit == (params_2).cost_page_hit && \
+ (params_1).cost_page_miss == (params_2).cost_page_miss)
+
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
*/
typedef struct PVSharedCostParams
{
@@ -71,20 +97,11 @@ typedef struct PVSharedCostParams
*/
pg_atomic_uint32 generation;
- slock_t spinlock; /* protects all fields below */
+ slock_t mutex; /* protects all fields below */
/* Copies of corresponding parameters from autovacuum leader process */
- double cost_delay;
- int cost_limit;
- int cost_page_dirty;
- int cost_page_hit;
- int cost_page_miss;
-} PVSharedCostParams;
-
-static PVSharedCostParams * pv_shared_cost_params = NULL;
-
-/* See comments for structure above for the explanation. */
-static uint32 shared_params_generation_local = 0;
+ CostParamsData params_data;
+} PVSharedCostParams;
/*
* Shared information among parallel workers. So this is allocated in the DSM
@@ -269,6 +286,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments for the PVSharedCostParams structure for the explanation. */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -444,13 +466,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
if (shared->am_parallel_autovacuum)
{
- shared->cost_params.cost_delay = vacuum_cost_delay;
- shared->cost_params.cost_limit = vacuum_cost_limit;
- shared->cost_params.cost_page_dirty = VacuumCostPageDirty;
- shared->cost_params.cost_page_hit = VacuumCostPageHit;
- shared->cost_params.cost_page_miss = VacuumCostPageMiss;
+ FillCostParamsData(&shared->cost_params.params_data);
pg_atomic_init_u32(&shared->cost_params.generation, 0);
- SpinLockInit(&shared->cost_params.spinlock);
+ SpinLockInit(&shared->cost_params.mutex);
+
pv_shared_cost_params = &(shared->cost_params);
}
@@ -599,13 +618,18 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
}
/*
- * Function to be called from parallel autovacuum worker in order to sync
- * some cost-based delay parameter with the leader worker.
+ * If we are a parallel *autovacuum* worker, check whether the cost-based
+ * delay parameters have changed in the leader worker. If so, update the
+ * local parameters to the values the leader worker is currently
+ * operating on.
+ *
+ * For non-autovacuum parallel worker this function will have no effect.
*/
void
parallel_vacuum_update_shared_delay_params(void)
{
- uint32 params_generation;
+ uint32 params_generation;
+ CostParamsData shared_params_data;
Assert(IsParallelWorker());
@@ -613,22 +637,24 @@ parallel_vacuum_update_shared_delay_params(void)
if (pv_shared_cost_params == NULL)
return;
- params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
Assert(shared_params_generation_local <= params_generation);
/* Return if parameters had not changed in the leader */
if (params_generation == shared_params_generation_local)
return;
- SpinLockAcquire(&pv_shared_cost_params->spinlock);
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
- VacuumCostDelay = pv_shared_cost_params->cost_delay;
- VacuumCostLimit = pv_shared_cost_params->cost_limit;
- VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
- VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
- VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ shared_params_data = pv_shared_cost_params->params_data;
- SpinLockRelease(&pv_shared_cost_params->spinlock);
+ VacuumCostDelay = shared_params_data.cost_delay;
+ VacuumCostLimit = shared_params_data.cost_limit;
+ VacuumCostPageDirty = shared_params_data.cost_page_dirty;
+ VacuumCostPageHit = shared_params_data.cost_page_hit;
+ VacuumCostPageMiss = shared_params_data.cost_page_miss;
+
+ SpinLockRelease(&pv_shared_cost_params->mutex);
VacuumUpdateCosts();
@@ -640,9 +666,9 @@ parallel_vacuum_update_shared_delay_params(void)
* some cost-based parameters to the supportive workers.
*/
void
-parallel_vacuum_propagate_cost_based_params(void)
+parallel_vacuum_propagate_shared_delay_params(void)
{
- uint32 params_generation;
+ CostParamsData local_params_data;
Assert(AmAutoVacuumWorkerProcess());
@@ -650,24 +676,28 @@ parallel_vacuum_propagate_cost_based_params(void)
if (pv_shared_cost_params == NULL)
return;
- params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ FillCostParamsData(&local_params_data);
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
- SpinLockAcquire(&pv_shared_cost_params->spinlock);
+ if (CostParamsDataEqual(pv_shared_cost_params->params_data,
+ local_params_data))
+ {
+ /*
+ * We don't need to update shared delay params if they haven't
+ * changed.
+ */
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+ return;
+ }
- pv_shared_cost_params->cost_delay = vacuum_cost_delay;
- pv_shared_cost_params->cost_limit = vacuum_cost_limit;
- pv_shared_cost_params->cost_page_dirty = VacuumCostPageDirty;
- pv_shared_cost_params->cost_page_hit = VacuumCostPageHit;
- pv_shared_cost_params->cost_page_miss = VacuumCostPageMiss;
+ FillCostParamsData(&pv_shared_cost_params->params_data);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
/*
* Increase generation of the parameters, i.e. let parallel workers know
* that they should re-read shared cost params.
*/
- pg_atomic_write_u32(&pv_shared_cost_params->generation,
- params_generation + 1);
-
- SpinLockRelease(&pv_shared_cost_params->spinlock);
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
}
/*
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bd67e7748f0..b10829a9379 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -424,7 +424,7 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
bool estimated_count,
PVWorkersUsage *wusage);
extern void parallel_vacuum_update_shared_delay_params(void);
-extern void parallel_vacuum_propagate_cost_based_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 84bfa2970de..28b91d69086 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -544,6 +544,7 @@ CopyToRoutine
CopyToState
CopyToStateData
Cost
+CostParamsData
CostSelector
Counters
CoverExt
@@ -2066,6 +2067,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
--
2.43.0
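For readers following the cost-parameter hunks above, here is a minimal standalone sketch of the generation-counter pattern they use. It is not the patch's code: it assumes C11 atomics in place of pg_atomic_uint32 and an atomic_flag in place of slock_t, and the names CostParams, SharedCostParams, leader_propagate and worker_refresh are hypothetical stand-ins for CostParamsData, PVSharedCostParams and the two parallel_vacuum_*_shared_delay_params functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CostParamsData. */
typedef struct CostParams
{
	double		cost_delay;
	int			cost_limit;
	int			cost_page_dirty;
	int			cost_page_hit;
	int			cost_page_miss;
} CostParams;

/* Hypothetical stand-in for PVSharedCostParams. */
typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped after every published change */
	atomic_flag lock;			/* assumption: plays the role of the spinlock */
	CostParams	params;
} SharedCostParams;

static void
shared_lock(SharedCostParams *s)
{
	while (atomic_flag_test_and_set(&s->lock))
		;						/* spin until the flag is released */
}

static void
shared_unlock(SharedCostParams *s)
{
	atomic_flag_clear(&s->lock);
}

static bool
cost_params_equal(const CostParams *a, const CostParams *b)
{
	return a->cost_delay == b->cost_delay &&
		a->cost_limit == b->cost_limit &&
		a->cost_page_dirty == b->cost_page_dirty &&
		a->cost_page_hit == b->cost_page_hit &&
		a->cost_page_miss == b->cost_page_miss;
}

static void
shared_init(SharedCostParams *s, const CostParams *initial)
{
	atomic_init(&s->generation, 0);
	atomic_flag_clear(&s->lock);
	s->params = *initial;
}

/* Leader side: publish only when something changed, then bump generation. */
static bool
leader_propagate(SharedCostParams *s, const CostParams *local)
{
	shared_lock(s);
	if (cost_params_equal(&s->params, local))
	{
		shared_unlock(s);
		return false;
	}
	s->params = *local;
	shared_unlock(s);

	atomic_fetch_add(&s->generation, 1);
	return true;
}

/* Worker side: cheap generation check first, copy under the lock if stale. */
static bool
worker_refresh(SharedCostParams *s, uint32_t *seen_generation, CostParams *out)
{
	uint32_t	gen = atomic_load(&s->generation);

	if (gen == *seen_generation)
		return false;

	shared_lock(s);
	*out = s->params;
	shared_unlock(s);

	*seen_generation = gen;
	return true;
}
```

The point of the design is that the common case (nothing changed) costs the worker one atomic read and never touches the lock; the lock is held only long enough to copy a small struct.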
[text/x-patch] v20--v21-diff-for-0001.patch (3.5K, 8-v20--v21-diff-for-0001.patch)
From cb6f66f944dbb48a31c90823dd23a6a5d6313250 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 7 Feb 2026 00:04:17 +0700
Subject: [PATCH 1/7] fixes for patch 1
---
src/backend/commands/vacuumparallel.c | 5 ++++-
src/backend/postmaster/autovacuum.c | 18 +++++-------------
src/backend/utils/misc/postgresql.conf.sample | 2 +-
3 files changed, 10 insertions(+), 15 deletions(-)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 1e35b82aeaf..d3e0c32b7ee 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -710,8 +710,11 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* Tell autovacuum that we could not launch all the previously
* reserved workers.
*/
- if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched < nworkers)
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
if (pvs->pcxt->nworkers_launched > 0)
{
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 784c1178d61..f40abe90ed5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3526,7 +3526,7 @@ AutoVacuumShmemInit(void)
AutoVacuumShmem->av_launcherpid = 0;
AutoVacuumShmem->av_maxParallelWorkers =
- Min(autovacuum_max_parallel_workers, max_worker_processes);
+ Min(autovacuum_max_parallel_workers, max_parallel_workers);
AutoVacuumShmem->av_freeParallelWorkers =
AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
@@ -3622,23 +3622,15 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
- nfree_workers =
- autovacuum_max_parallel_workers - prev_max_parallel_workers +
- AutoVacuumShmem->av_freeParallelWorkers;
-
/*
* Cap or increase number of free parallel workers according to the
* parameter change.
*/
- AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
-
- /*
- * Don't allow number of free workers to become less than zero if the
- * patameter was decreased.
- */
- AutoVacuumShmem->av_freeParallelWorkers =
- Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
LWLockRelease(AutovacuumLock);
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 35c37f21239..e456fd759eb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -695,7 +695,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
-#autovacuum_max_parallel_workers = 2 # limited by max_worker_processes
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
--
2.43.0
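The arithmetic in adjust_free_parallel_workers() above is easy to check by hand. The sketch below isolates it as a pure function so the clamping behaviour can be tried with sample numbers; the function name adjust_free_workers and the sample values are assumptions for illustration and do not appear in the patch.

```c
#include <assert.h>

static int
max_int(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical pure-function version of the free-worker adjustment:
 * shift the free count by the change in the limit, never below zero.
 */
static int
adjust_free_workers(int free_workers, int prev_max, int new_max)
{
	int			nfree = new_max - prev_max + free_workers;

	return max_int(nfree, 0);
}
```

For example, with a previous limit of 10 and 8 workers free (2 in use), lowering the limit to 1 gives 1 - 10 + 8 = -1, which is clamped to 0; raising the limit to 12 gives 12 - 10 + 8 = 10.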
[text/x-patch] v20--v21-diff-for-0002.patch (8.3K, 9-v20--v21-diff-for-0002.patch)
From b7a0226646ee306400ca50c5404f3f02b0c7fda0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 7 Feb 2026 00:11:49 +0700
Subject: [PATCH 3/7] fixes for patch 2
---
src/backend/access/heap/vacuumlazy.c | 60 ++++++++++++++++++++-------
src/backend/commands/vacuumparallel.c | 22 +++++-----
src/include/commands/vacuum.h | 15 +++++--
src/tools/pgindent/typedefs.list | 2 +
4 files changed, 70 insertions(+), 29 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d14f055b40d..d19e15cbcce 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -784,8 +784,10 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->vm_new_visible_frozen_pages = 0;
vacrel->vm_new_frozen_pages = 0;
- vacrel->workers_usage.nlaunched = 0;
- vacrel->workers_usage.nplanned = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
@@ -1129,23 +1131,49 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
- if (vacrel->workers_usage.nplanned > 0 &&
- AmAutoVacuumWorkerProcess())
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
{
- /* Worker usage stats for parallel autovacuum */
- appendStringInfo(&buf,
- _("parallel index vacuum/cleanup: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
- vacrel->workers_usage.nplanned,
- vacrel->workers_usage.nreserved,
- vacrel->workers_usage.nlaunched);
+ /* Stats for vacuum phase of index vacuuming. */
+
+ if (AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel index vacuum: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL). */
+ appendStringInfo(&buf,
+ _("parallel index vacuum: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
}
- else if (vacrel->workers_usage.nplanned > 0)
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
{
- /* Worker usage stats for manual VACUUM (PARALLEL) */
- appendStringInfo(&buf,
- _("parallel index vacuum/cleanup: %d workers were planned and %d workers were launched in total\n"),
- vacrel->workers_usage.nplanned,
- vacrel->workers_usage.nlaunched);
+ /* Stats for cleanup phase of index vacuuming. */
+
+ if (AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel index cleanup: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nreserved,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ else
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL). */
+ appendStringInfo(&buf,
+ _("parallel index cleanup: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
}
for (int i = 0; i < vacrel->nindexes; i++)
{
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ea45dc3fc37..86d9f2b74c9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum, PVWorkersUsage *wusage);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -513,7 +513,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -534,7 +535,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -619,7 +621,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum, PVWorkersUsage *wusage)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -657,8 +659,8 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
nworkers = Min(nworkers, pvs->pcxt->nworkers);
/* Remember this value, if we asked to */
- if (wusage != NULL && nworkers > 0)
- wusage->nplanned += nworkers;
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
/*
* Reserve workers in autovacuum global state. Note that we may be given
@@ -669,8 +671,8 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
AutoVacuumReserveParallelWorkers(&nworkers);
/* Remember this value, if we asked to */
- if (wusage != NULL)
- wusage->nreserved += nworkers;
+ if (wstats != NULL)
+ wstats->nreserved += nworkers;
}
/*
@@ -741,8 +743,8 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
/* Remember this value, if we asked to */
- if (wusage != NULL)
- wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7cbb59d124f..d3dc4e8cc67 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -301,16 +301,25 @@ typedef struct VacDeadItemsInfo
} VacDeadItemsInfo;
/*
- * PVWorkersUsage stores information about total number of launched, reserved
- * and planned workers during parallel vacuum.
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
*/
-typedef struct PVWorkersUsage
+typedef struct PVWorkersStats
{
int nplanned; /* # of parallel workers we are planned to
* launch */
int nreserved; /* for autovacuum only - # of parallel workers
* we have managed to reserve */
int nlaunched; /* # of launched parallel workers */
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
} PVWorkersUsage;
/* GUC parameters */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1988cd874fd..84bfa2970de 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2066,6 +2066,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
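The split of PVWorkersUsage into per-phase PVWorkersStats above can be illustrated with a small standalone sketch. It assumes nothing from PostgreSQL: WorkersStats, WorkersUsage, record_pass and format_stats are hypothetical names, and snprintf stands in for appendStringInfo. The message text follows the strings added to heap_vacuum_rel() in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of PVWorkersStats. */
typedef struct WorkersStats
{
	int			nplanned;
	int			nreserved;		/* meaningful for autovacuum only */
	int			nlaunched;
} WorkersStats;

/* Hypothetical mirror of PVWorkersUsage: one counter set per phase. */
typedef struct WorkersUsage
{
	WorkersStats vacuum;
	WorkersStats cleanup;
} WorkersUsage;

/* Accumulate the numbers of one index-processing pass, if asked to. */
static void
record_pass(WorkersStats *stats, int planned, int reserved, int launched)
{
	if (stats == NULL)
		return;
	stats->nplanned += planned;
	stats->nreserved += reserved;
	stats->nlaunched += launched;
}

/* Format one phase's line; writes an empty string if nothing was planned. */
static int
format_stats(char *buf, size_t len, const char *phase,
			 const WorkersStats *stats, bool is_autovacuum)
{
	if (stats->nplanned <= 0)
	{
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}

	if (is_autovacuum)
		return snprintf(buf, len,
						"parallel index %s: %d workers were planned, %d workers were reserved and %d workers were launched in total\n",
						phase, stats->nplanned, stats->nreserved,
						stats->nlaunched);

	return snprintf(buf, len,
					"parallel index %s: %d workers were planned and %d workers were launched in total\n",
					phase, stats->nplanned, stats->nlaunched);
}
```

Passing &usage.vacuum or &usage.cleanup to the recording function, as the patch does with &wusage->vacuum and &wusage->cleanup, keeps the callee unaware of which phase it is counting for.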
[text/x-patch] v20--v21-diff-for-0004.patch (18.3K, 10-v20--v21-diff-for-0004.patch)
From 82c1f442ba5e18e3d6dc3cb4ed4f24cd3a8d910f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 10 Feb 2026 21:31:14 +0700
Subject: [PATCH 7/7] fixes for patch 4
---
src/backend/access/heap/vacuumlazy.c | 7 +
src/backend/commands/vacuumparallel.c | 32 ++-
src/backend/postmaster/autovacuum.c | 19 +-
.../modules/test_autovacuum/t/001_basic.pl | 188 ++++++++----------
4 files changed, 121 insertions(+), 125 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d19e15cbcce..2e85f7f17f7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -151,6 +151,7 @@
#include "storage/freespace.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -869,6 +870,12 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+ /*
+ * Trigger an injection point if parallel autovacuum is about to start.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 13649747322..5dad19d8ed8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -928,6 +928,9 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* To be able to exercise whether all reserved parallel workers are being
* released anyway, allow injection points to trigger a failure at this
* point.
+ *
+ * This injection point is also used to wait until parallel workers
+ * finish their part of index processing.
*/
if (nworkers > 0)
INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
@@ -941,15 +944,6 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
parallel_vacuum_process_safe_indexes(pvs);
- /*
- * To be able to exercise whether leader parallel autovacuum worker can
- * propagate cost-based params to parallel workers, wait here until
- * configuration is changed. I.e. tests are expecting, that during index
- * processing vacuum_delay_point have been called (if config was changed).
- */
- if (nworkers > 0)
- INJECTION_POINT("autovacuum-leader-after-indexes-processing", NULL);
-
/*
* Next, accumulate buffer and WAL usage. (This must wait for the workers
* to finish, or we might get incomplete data.)
@@ -1315,20 +1309,16 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Prepare to track buffer usage during parallel execution */
InstrStartParallelQuery();
- INJECTION_POINT("parallel-worker-before-indexes-processing", NULL);
-
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
#ifdef USE_INJECTION_POINTS
/*
- * There is no guarantee that each parallel worker will necessarily
- * process at least one index. Thus, at this point we cannot be sure that
- * worker called vacuum_cost_delay. In order to test cost-based parameters
- * propagation (from leader worker), call vacuum_delay_point here, if
- * injection point is active.
+ * If we are a parallel autovacuum worker, we can consume delay parameters
+ * during index processing (via a vacuum_delay_point call). This logging
+ * allows tests to verify that.
*/
- if (IS_INJECTION_POINT_ATTACHED("parallel-autovacuum-force-delay-point"))
+ if (shared->am_parallel_autovacuum)
parallel_vacuum_report_cost_based_params();
#endif
@@ -1392,6 +1382,7 @@ parallel_vacuum_error_callback(void *arg)
static void
parallel_vacuum_report_cost_based_params(void)
{
+#ifdef USE_INJECTION_POINTS
StringInfoData buf;
/* Simulate config reload during normal processing */
@@ -1402,12 +1393,15 @@ parallel_vacuum_report_cost_based_params(void)
initStringInfo(&buf);
appendStringInfo(&buf, "Vacuum cost-based delay parameters of parallel worker:\n");
- appendStringInfo(&buf,"vacuum_cost_limit = %d\n",vacuum_cost_limit);
+ appendStringInfo(&buf, "vacuum_cost_limit = %d\n", vacuum_cost_limit);
appendStringInfo(&buf, "vacuum_cost_delay = %g\n", vacuum_cost_delay);
appendStringInfo(&buf, "vacuum_cost_page_miss = %d\n", VacuumCostPageMiss);
appendStringInfo(&buf, "vacuum_cost_page_dirty = %d\n", VacuumCostPageDirty);
appendStringInfo(&buf, "vacuum_cost_page_hit = %d\n", VacuumCostPageHit);
- ereport(LOG, errmsg("%s", buf.data));
+ ereport(DEBUG2, errmsg("%s", buf.data));
pfree(buf.data);
+#else
+ elog(ERROR, "Injection points are not supported by this build");
+#endif
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index b9ff60be0f2..7b24a5d6e67 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -800,8 +800,6 @@ ProcessAutoVacLauncherInterrupts(void)
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
-
- INJECTION_POINT("autovacuum-launcher-after-reload-config", NULL);
}
/* Process barrier events */
@@ -2497,12 +2495,20 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ int nreserved_workers = av_nworkers_reserved;
+
/*
* Parallel autovacuum can reserve parallel workers. Make sure
* that all reserved workers are released.
*/
AutoVacuumReleaseAllParallelWorkers();
+ if (nreserved_workers > 0)
+ ereport(DEBUG2,
+ (errmsg("%d parallel autovacuum workers has been released after occured error",
+ nreserved_workers),
+ errhidecontext(true)));
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -3469,15 +3475,13 @@ AutoVacuumReleaseAllParallelWorkers(void)
/*
* Get number of free autovacuum parallel workers.
- *
- * For testing purpose only!
*/
uint32
AutoVacuumGetFreeParallelWorkers(void)
{
uint32 nfree_workers;
- LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ LWLockAcquire(AutovacuumLock, LW_SHARED);
nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
LWLockRelease(AutovacuumLock);
@@ -3652,5 +3656,10 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+ ereport(DEBUG2,
+ (errmsg("number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers),
+ errhidecontext(true)));
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
index 065a58ef2e6..c5d8fffc47c 100644
--- a/src/test/modules/test_autovacuum/t/001_basic.pl
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -1,3 +1,5 @@
+# Test parallel autovacuum behavior
+
use warnings FATAL => 'all';
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
@@ -21,17 +23,9 @@ sub prepare_for_next_test
$node->safe_psql('postgres', qq{
UPDATE test_autovac SET col_1 = $test_number;
- ANALYZE test_autovac;
});
}
-sub wait_for_av_log
-{
- my ($node, $expected_log) = @_;
-
- $node->wait_for_log($expected_log);
- truncate $node->logfile, 0 or die "truncate failed: $!";
-}
my $psql_out;
@@ -71,31 +65,30 @@ my $indexes_num = 4;
my $initial_rows_num = 10_000;
my $autovacuum_parallel_workers = 2;
-# Create table with specified number of b-tree indexes on it
+# Create table and fill it with some data
$node->safe_psql('postgres', qq{
CREATE TABLE test_autovac (
id SERIAL PRIMARY KEY,
col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers);
- DO \$\$
- DECLARE
- i INTEGER;
- BEGIN
- FOR i IN 1..$indexes_num LOOP
- EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
- END LOOP;
- END \$\$;
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
});
-# Insert specified tuples num into the table
+# Create specified number of b-tree indexes on the table
$node->safe_psql('postgres', qq{
DO \$\$
DECLARE
i INTEGER;
BEGIN
- FOR i IN 1..$initial_rows_num LOOP
- INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
END LOOP;
END \$\$;
});
@@ -115,14 +108,15 @@ $node->safe_psql('postgres', qq{
# Wait until the parallel autovacuum on table is completed. At the same time,
# we check that the required number of parallel workers has been started.
-wait_for_av_log($node,
- qr/parallel index vacuum\/cleanup: 2 workers were planned, / .
- qr/2 workers were reserved and 2 workers were launched in total/);
-
-$node->psql('postgres',
- "SELECT get_parallel_autovacuum_free_workers();",
- stdout => \$psql_out,
+$log_start = $node->wait_for_log(
+ qr/parallel index vacuum: 2 workers were planned, / .
+ qr/2 workers were reserved and 2 workers were launched in total/,
+ $log_start
);
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
is($psql_out, 20, 'All parallel workers has been released by the leader');
# Test 2:
@@ -132,19 +126,16 @@ is($psql_out, 20, 'All parallel workers has been released by the leader');
prepare_for_next_test($node, 2);
$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
- SELECT injection_points_attach('autovacuum-leader-after-indexes-processing', 'wait');
- SELECT injection_points_attach('parallel-worker-before-indexes-processing', 'wait');
- SELECT injection_points_attach('parallel-autovacuum-force-delay-point', 'wait');
ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
});
-# Wait until parallel autovacuum leader launches parallel worker and falls
-# asleep on the injection point
+# Wait until parallel autovacuum is initialized
$node->wait_for_event(
'autovacuum worker',
- 'autovacuum-leader-before-indexes-processing'
+ 'autovacuum-start-parallel-vacuum'
);
# Reload config - leader worker must update its own parameters during indexes
@@ -158,45 +149,34 @@ $node->safe_psql('postgres', qq{
});
$node->safe_psql('postgres', qq{
- SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
});
-# Wait until leader worker is guaranteed to update parameters and propagate
-# their values to the parallel worker
+# Now wait until the parallel autovacuum leader completes processing the table
+# (i.e. it is guaranteed to have called vacuum_delay_point) and launches a
+# parallel worker.
$node->wait_for_event(
'autovacuum worker',
- 'autovacuum-leader-after-indexes-processing'
+ 'autovacuum-leader-before-indexes-processing'
);
-$node->safe_psql('postgres', qq{
- SELECT injection_points_wakeup('autovacuum-leader-after-indexes-processing');
-});
-
-# Now wake up the parallel worker and force it to call vacuum_delay_point
-$node->wait_for_event(
- 'parallel worker',
- 'parallel-worker-before-indexes-processing'
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/Vacuum cost-based delay parameters of parallel worker:\n/ .
+ qr/\tvacuum_cost_limit = 500\n/ .
+ qr/\tvacuum_cost_delay = 2\n/ .
+ qr/\tvacuum_cost_page_miss = 10\n/ .
+ qr/\tvacuum_cost_page_dirty = 10\n/ .
+ qr/\tvacuum_cost_page_hit = 10\n/,
+ $log_start
);
-$node->safe_psql('postgres', qq{
- SELECT injection_points_wakeup('parallel-worker-before-indexes-processing');
-});
-
-# Check whether worker successfully updated all parameters
-wait_for_av_log($node,
- qr/Vacuum cost-based delay parameters of parallel worker:\n/ .
- qr/\tvacuum_cost_limit = 500\n/ .
- qr/\tvacuum_cost_delay = 2\n/ .
- qr/\tvacuum_cost_page_miss = 10\n/ .
- qr/\tvacuum_cost_page_dirty = 10\n/ .
- qr/\tvacuum_cost_page_hit = 10\n/);
-
# Cleanup
$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
- SELECT injection_points_detach('autovacuum-leader-after-indexes-processing');
- SELECT injection_points_detach('parallel-worker-before-indexes-processing');
- SELECT injection_points_detach('parallel-autovacuum-force-delay-point');
ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
});
@@ -209,7 +189,6 @@ prepare_for_next_test($node, 4);
$node->safe_psql('postgres', qq{
SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
- SELECT injection_points_attach('autovacuum-launcher-after-reload-config', 'wait');
ALTER TABLE test_autovac SET (autovacuum_enabled = true);
});
@@ -223,53 +202,44 @@ $node->safe_psql('postgres', qq{
SELECT pg_reload_conf();
});
-$node->wait_for_event(
- 'autovacuum launcher',
- 'autovacuum-launcher-after-reload-config'
-);
-
# Since 2 parallel workers already launched and will be released in the future,
# we are expecting that :
# 1) number of free workers will be '0' after config reload
# 2) number of free workers will be '1' after releasing workers
# Check statement (1)
-$node->psql('postgres',
- "SELECT get_parallel_autovacuum_free_workers();",
- stdout => \$psql_out,
+$log_start = $node->wait_for_log(
+ qr/number of free parallel autovacuum workers is set to 0 due to config reload/,
+ $log_start
);
-is($psql_out, 0,
- 'Number of free parallel workers is consistent');
$node->safe_psql('postgres', qq{
- SELECT injection_points_wakeup('autovacuum-launcher-after-reload-config');
SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
});
# Wait until the end of parallel processing
-wait_for_av_log($node,
- qr/parallel index vacuum\/cleanup: 2 workers were planned, / .
- qr/2 workers were reserved and 2 workers were launched in total/);
+$log_start = $node->wait_for_log(
+ qr/parallel index vacuum: 2 workers were planned, / .
+ qr/2 workers were reserved and 2 workers were launched in total/,
+ $log_start
+);
# Check statement (2)
-$node->psql('postgres',
- "SELECT get_parallel_autovacuum_free_workers();",
- stdout => \$psql_out,
-);
-is($psql_out, 1,
- 'Number of free parallel workers is consistent');
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 1, 'Number of free parallel workers is consistent');
# Cleanup
$node->safe_psql('postgres', qq{
SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
- SELECT injection_points_detach('autovacuum-launcher-after-reload-config');
ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
SELECT pg_reload_conf();
});
# Test 4:
# We want parallel autovacuum workers to be released even if leader gets an
-# error. At first, simulate situation, when leader exites due to an ERROR.
+# error. At first, simulate situation, when leader exits due to an ERROR.
prepare_for_next_test($node, 4);
@@ -278,16 +248,16 @@ $node->safe_psql('postgres', qq{
ALTER TABLE test_autovac SET (autovacuum_enabled = true);
});
-wait_for_av_log($node,
- qr/error triggered for injection point / .
- qr/autovacuum-leader-before-indexes-processing/);
+$log_start = $node->wait_for_log(
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/,
+ $log_start
+);
-$node->psql('postgres',
- "SELECT get_parallel_autovacuum_free_workers();",
- stdout => \$psql_out,
+$log_start = $node->wait_for_log(
+ qr/2 parallel autovacuum workers has been released after occured error/,
+ $log_start
);
-is($psql_out, 10,
- 'All parallel workers has been released by the leader after ERROR');
# Cleanup
$node->safe_psql('postgres', qq{
@@ -295,15 +265,25 @@ $node->safe_psql('postgres', qq{
});
# Test 5:
-# Same as above test, but simulate situation, when leader exites due to FATAL.
+# Same as above test, but simulate situation, when leader exits due to FATAL.
prepare_for_next_test($node, 5);
$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
ALTER TABLE test_autovac SET (autovacuum_enabled = true);
});
+# Wait until parallel autovacuum is inited and wake up the leader
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
$node->wait_for_event(
'autovacuum worker',
'autovacuum-leader-before-indexes-processing'
@@ -327,18 +307,24 @@ $node->safe_psql('postgres', qq{
SELECT pg_terminate_backend('$av_pid');
});
-wait_for_av_log($node,
- qr/terminating autovacuum process due to administrator command/);
-
-$node->psql('postgres',
- "SELECT get_parallel_autovacuum_free_workers();",
- stdout => \$psql_out,
+$log_start = $node->wait_for_log(
+ qr/terminating autovacuum process due to administrator command/,
+ $log_start
);
-is($psql_out, 10,
- 'All parallel workers has been released by the leader after FATAL');
+
+# Now it is safe to check the number of free parallel workers, because even if
+# autovacuum is trying to vacuum table in parallel mode again, the leader
+# worker cannot go any further than "autovacuum-start-parallel-vacuum" point.
+# I.e. no one can interfere and change the number of free parallel workers.
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 10, 'All parallel workers has been released by the leader after FATAL');
# Cleanup
$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
});
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-02-25 23:59 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-02-25 23:59 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Feb 11, 2026 at 12:04 AM Daniil Davydov <[email protected]> wrote:
>
> On Thu, Jan 22, 2026 at 5:29 AM Masahiko Sawada <[email protected]> wrote:
> >
> > On Sat, Jan 17, 2026 at 6:52 AM Daniil Davydov <[email protected]> wrote:
> > >
> > > I will keep the 'max_worker_processes' limit, so autovacuum will not
> > > waste time initializing a parallel context if there is no chance that
> > > the request will succeed.
> > > But it's worth remembering that actually the
> > > 'autovacuum_max_parallel_workers' parameter will always be implicitly
> > > capped by 'max_parallel_workers'.
> >
> > It doesn't make sense to me that we limit
> > autovacuum_max_parallel_workers by max_worker_processes TBH. When
> > users want to have more parallel vacuum workers for autovacuum and the
> > VACUUM command, they would have to consider max_worker_processes,
> > max_parallel_workers, and max_parallel_maintenance_workers separately.
> > Given that max_parallel_workers is controlling the number of
> > max_worker_processes that can be used in parallel operations, I
> > believe that parallel vacuum workers for autovacuum should also be
> > taken from that pool.
>
> Maybe I don't quite understand the meaning of "limited by". For example,
> we have a max_parallel_workers_per_gather parameter, which is limited
> by max_parallel_workers. But actually we can set this parameter to a value
> that is higher than max_parallel_workers. The limitation is that for Gather
> node we cannot request more workers than are available in bgworkers pool
> (where number of free workers is always <= max_parallel_workers). Thus,
> limitation actually exists only for bgworkers pool, on which other parallel
> operations depend. In particular, whatever values we set for the
> autovacuum_max_parallel_workers parameter, it always will depend only
> on bgworkers pool.
Right, parallel workers are actually taken from bgworkers pool.
>
> I'll give in to your opinion and add a limitation by max_parallel_workers.
> But I still don't understand where the point is in explicit limitation by
> max_parallel_workers, if we already have this limitation implicitly?
> It seems a bit redundant for me. I hope I've conveyed my point correctly.
max_worker_processes controls the number of available bgworkers in the
database cluster, and bgworkers are used for parallel queries, logical
replication, and extensions as well. Also, it requires a
server restart to change. max_parallel_workers controls "how many
bgworkers can be used for parallel queries in total?" and is a
PGC_USERSET parameter. I think it's easier for users to tune parallel
query related parameters since all bgworkers for parallel queries
(i.e., parallel workers) are taken from max_parallel_workers pool. For
example, if users want to disable all parallel queries, they can do
that by setting max_parallel_workers to 0. If parallel vacuum workers
for autovacuums are taken from max_worker_processes pool (i.e.,
without max_parallel_workers limit), users would need to set both
max_parallel_workers and autovacuum_max_parallel_workers to 0.
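To illustrate the capping I have in mind, here is a minimal sketch. It is
not code from the patch: the function name is hypothetical, and it assumes
the request is simply clamped by the smaller of the two settings before any
workers are taken from the bgworkers pool.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): compute how many parallel
 * workers an autovacuum leader may ask for when both settings act as caps.
 */
static int
sketch_av_parallel_workers_cap(int nrequested,
                               int autovacuum_max_parallel_workers,
                               int max_parallel_workers)
{
    int cap;

    assert(nrequested >= 0);
    assert(autovacuum_max_parallel_workers >= 0);
    assert(max_parallel_workers >= 0);

    /* The smaller of the two settings wins. */
    cap = autovacuum_max_parallel_workers;
    if (max_parallel_workers < cap)
        cap = max_parallel_workers;

    /* Never hand out more than was requested. */
    return (nrequested < cap) ? nrequested : cap;
}
```

With max_parallel_workers set to 0 the result is always 0, whatever
autovacuum_max_parallel_workers says, which is the single-knob behavior
described above.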
> --
>
> > + /*
> > + * If 'true' then we are running parallel autovacuum. Otherwise, we are
> > + * running parallel maintenence VACUUM.
> > + */
> > + bool am_parallel_autovacuum;
> >
> > How about renaming it to use_shared_delay_params? I think it conveys
> > better what the field is used for.
>
> I think that we should leave this name, because in the future some other
> behavior differences may occur between manual VACUUM and autovacuum.
> If so, we will already have an "am_autovacuum" field which we can use in
> the code.
> The existing logic with the "am_autovacuum" name is also LGTM - we should
> use shared delay params only because we are running parallel autovacuum.
It may occur but we can change the field name when it really comes.
I'm slightly concerned that we've been using am_xxx variables in a
different way. For instance, am_walsender is a global variable that is
set to true only in wal sender processes. Also we have a bunch of
AmXXProcess() macros that check the global variable MyBackendType to
determine the kind of the current process. That is, the subject of 'am'
is typically the process, I guess. On the other hand,
am_parallel_autovacuum is stored in DSM space and indicates whether a
parallel vacuum is invoked by manual VACUUM or autovacuum.
>
> > Truncating all logs after every test would decrease the debuggability
> > much. We can pass the offset as the start point to wait for the
> > contents.
> >
>
> I've combined two of your above comments purposely. I agree that truncating
> all logs is a bad decision and we need to solve this in a different way. But the
> problem will occur If we want to 1) use logging instead of a test-only function
> and 2) use offsets as the start point to wait for the contents in the logfile.
>
> Imagine that we (using the described approach) need to wait until the end of
> parallel index processing and determine the current number of free parallel
> workers.
>
> IIUC, It'll look like this :
> wait_for_av_log("autovacuum processing finished");
> wait_for_av_log("number of free workers = N");
>
> But when we call wait_for_av_log first time, we will advance "offset" to the
> end of logfile and thus we will miss the "number of free workers = N". The
> only way to avoid it is to write a function that will determine the exact
> position of "autovacuum processing finished" in the logfile. Isn't it too much?
>
> I think that using wait_for_av_log("autovacuum processing finished"); +
> SELECT get_parallel_autovacuum_free_workers(); will be much clearer
> and simpler.
>
> Moreover, the AutoVacuumGetFreeParallelWorkers function doesn't
> seem completely useless in isolation from tests. I suggest leaving
> this function and its usage in the tests. I can remove the "For testing
> purpose only!" comment, so everyone will be free to use this function
> in the future.
Agreed. The updated test scenario looks reasonable to me.
>
> 1)
> Test 5 can be stabilized as follows :
> We can attach to the "autovacuum-start-parallel-vacuum" injection point in
> the "wait" mode. Thereby when we terminate the first a/v leader, we are
> guaranteed that no other a/v leader will reach release/reserve functions.
> And then we are free to call the get_parallel_autovacuum_free_workers
> function. I'll additionally describe this logic in the test.
>
> 2)
> In the test 4 I found another problem : when a/v leader errors out, it will
> exit() pretty soon. And during exit() it will call the before_shmem_exit hook.
> Thus, we cannot be sure that parallel workers have been released exactly
> in the try/catch block. In order to guarantee it, I think that we should log
> something inside the try/catch block. I added some fairly controversial logging
> code for it, but it is the best I came up with.
>
> In the test 4 the above idea will look something like this:
> $log_start = $node->wait_for_log(
> qr/error triggered for injection point / .
> qr/autovacuum-leader-before-indexes-processing/,
> $log_start
> );
> $log_start = $node->wait_for_log(
> qr/2 parallel autovacuum workers has been released after occured error/,
> $log_start
> );
>
> Above I described a problem that may occur when we advance
> "logfile offset" too far after the first wait_for_log call. Here, this problem
> doesn't occur because the autovacuum launcher infinitely tries to
> vacuum the table, so other "N workers released" messages occur.
If we write the log "%d parallel autovacuum workers have been
released" in AutoVacuumReleaseParallelWorkers(), can we simplify both
tests (4 and 5) further?
I've reviewed all patches. The 0001 patch looks good to me.
0002 patch:
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel index vacuum: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
These log messages need to take care of plural forms but it seems to
be too long if we use errmsg_plural() for each number. So how about
something like:
parallel workers: index: %d planned, %d reserved, %d launched in total
parallel workers: cleanup %d planned, %d reserved, %d launched
(Index cleanup is executed at most once so we don't need "in total" in
the message.)
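As a rough sketch of why this shape avoids the plural problem: no noun in
the format string has to agree with any of the numbers, so one format works
for every count. The helper below is hypothetical and uses plain snprintf()
as a stand-in for appendStringInfo().

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): format the proposed
 * index-vacuum worker usage line into a caller-supplied buffer.
 */
static int
sketch_format_index_worker_usage(char *buf, size_t buflen,
                                 int nplanned, int nreserved, int nlaunched)
{
    /* The counts stand alone, so no singular/plural variants are needed. */
    return snprintf(buf, buflen,
                    "parallel workers: index: %d planned, %d reserved, %d launched in total",
                    nplanned, nreserved, nlaunched);
}
```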
0003 patch:
+typedef struct CostParamsData
+{
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} CostParamsData;
The name CostParamsData sounds too generic and I guess it could
conflict with optimizer-related struct names in the future. How about
renaming it to VacuumDelayParams?
---
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+
+ shared_params_data = pv_shared_cost_params->params_data;
+
+ VacuumCostDelay = shared_params_data.cost_delay;
+ VacuumCostLimit = shared_params_data.cost_limit;
+ VacuumCostPageDirty = shared_params_data.cost_page_dirty;
+ VacuumCostPageHit = shared_params_data.cost_page_hit;
+ VacuumCostPageMiss = shared_params_data.cost_page_miss;
+
+ SpinLockRelease(&pv_shared_cost_params->mutex);
If we copy the shared values in pv_shared_cost_params, we should
release the spinlock earlier, i.e., before updating VacuumCostXXX
variables. But I don't think we would even need to set these values in
the local variables in this case as updating 4 local variables is
fairly cheap.
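For the first alternative (keep the copy, release the lock early), a
minimal sketch follows. It is illustrative only: the Sketch* types, the
sketch_* variables, and the C11 atomic_flag used as a stand-in for the
spinlock are all assumptions, not the patch's actual definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for the shared parameter block in the patch. */
typedef struct SketchCostParams
{
    double cost_delay;
    int    cost_limit;
    int    cost_page_dirty;
    int    cost_page_hit;
    int    cost_page_miss;
} SketchCostParams;

typedef struct SketchSharedCostParams
{
    atomic_flag      mutex;        /* stand-in for the spinlock */
    SketchCostParams params_data;
} SketchSharedCostParams;

/* Stand-ins for VacuumCostDelay, VacuumCostLimit and friends. */
static double sketch_cost_delay;
static int    sketch_cost_limit;
static int    sketch_cost_page_dirty;
static int    sketch_cost_page_hit;
static int    sketch_cost_page_miss;

static void
sketch_apply_shared_cost_params(SketchSharedCostParams *shared)
{
    SketchCostParams copy;

    assert(shared != NULL);

    /* Hold the lock only while copying the struct. */
    while (atomic_flag_test_and_set_explicit(&shared->mutex,
                                             memory_order_acquire))
        ;                           /* spin */
    copy = shared->params_data;
    atomic_flag_clear_explicit(&shared->mutex, memory_order_release);

    /* Update the process-local values after the lock is released. */
    sketch_cost_delay = copy.cost_delay;
    sketch_cost_limit = copy.cost_limit;
    sketch_cost_page_dirty = copy.cost_page_dirty;
    sketch_cost_page_hit = copy.cost_page_hit;
    sketch_cost_page_miss = copy.cost_page_miss;
}
```

The other alternative is to drop the copy and assign directly from the
shared struct while holding the lock, which is also fine given how few
assignments are involved.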
---
+ FillCostParamsData(&local_params_data);
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+
+ if (CostParamsDataEqual(pv_shared_cost_params->params_data,
+ local_params_data))
+ {
IIUC it stores cost-based vacuum delay parameters into the
local_params_data only for using CostParamsDataEqual() macro. I think
it's better to directly compare values in pv_shared_cost_params and
the cost-based vacuum delay parameters.
0004 patch:
+ if (nworkers > 0)
+
INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
I think it's better to use #ifdef USE_INJECTION_POINTS here.
---
+#ifdef USE_INJECTION_POINTS
+/*
+ * Log values of the related to cost-based delay parameters. It is used for
s/values of the related to/values related to/
---
+ * testing purpose.
+ */
+static void
+parallel_vacuum_report_cost_based_params(void)
+{
+ StringInfoData buf;
+
+ /* Simulate config reload during normal processing */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+ vacuum_delay_point(false);
+ pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
Calling vacuum_delay_point() here feels a bit arbitrary to me. Since
parallel vacuum workers are calling
parallel_vacuum_report_cost_based_params() after
parallel_vacuum_process_safe_indexes(), I think we don't necessarily
call vacuum_delay_point() here.
---
+ appendStringInfo(&buf, "Vacuum cost-based delay parameters of parallel worker:\n");
+ appendStringInfo(&buf, "vacuum_cost_limit = %d\n",vacuum_cost_limit);
+ appendStringInfo(&buf, "vacuum_cost_delay = %g\n", vacuum_cost_delay);
+ appendStringInfo(&buf, "vacuum_cost_page_miss = %d\n", VacuumCostPageMiss);
+ appendStringInfo(&buf, "vacuum_cost_page_dirty = %d\n", VacuumCostPageDirty);
+ appendStringInfo(&buf, "vacuum_cost_page_hit = %d\n", VacuumCostPageHit);
I'd write these messages directly in elog() instead of using
StringInfoData, which is simpler and can save palloc()/pfree().
---
+ ereport(DEBUG2, errmsg("%s", buf.data));
Let's use elog() instead of ereport().
---
+# Create role with pg_signal_autovacuum_worker for terminating autovacuum worker.
+$node->safe_psql('postgres', qq{
+ CREATE ROLE regress_worker_role;
+ GRANT pg_signal_autovacuum_worker TO regress_worker_role;
+ SET ROLE regress_worker_role;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
These two safe_psql calls use separate connections, meaning that
pg_terminate_backend() is executed by the superuser rather than
regress_worker_role. I think we don't need to create the
regress_worker_role and we can use the superuser in this test case.
---
We would add more autovacuum related tests to the test_autovacuum
directory in the future. Given that the 001_basic.pl focuses on
parallel vacuum tests, how about renaming it to 001_parallel_vacuum.pl
or something?
> This time I'll try something experimental - besides the patches I'll also
> post differences between corresponding patches from v20 and v21.
> I.e. you can apply v20--v21-diff-for-0001 on the v20-0001 patch and
> get the v21-0001 patch. There are a lot of changes, so I guess it will
> help you during review. Please, let me know whether it is useful for you.
It was helpful to easily see the changes from the previous version. Thank you!
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-02-27 13:49 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-02-27 13:49 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Thu, Feb 26, 2026 at 6:59 AM Masahiko Sawada <[email protected]> wrote:
>
> For example, if users want to disable all parallel queries, they can do
> that by setting max_parallel_workers to 0. If parallel vacuum workers
> for autovacuums are taken from max_worker_processes pool (i.e.,
> without max_parallel_workers limit), users would need to set both
> max_parallel_workers and autovacuum_max_parallel_workers to 0.
>
This is kinda off-topic already, but I really want to clarify this question.
If parallel a/v workers are not limited by max_parallel_workers and the
user wants to disable all parallel operations, it is still enough to set
max_parallel_workers to 0. In this case parallel a/v could not acquire any
workers from bgworkers pool, and thus the user's goal is reached (and there
is no need to set autovacuum_max_parallel_workers to 0).
**Comments on the 0002 patch**
>
> + /* Worker usage stats for parallel autovacuum. */
> + appendStringInfo(&buf,
> + _("parallel index vacuum: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
> + vacrel->workers_usage.vacuum.nplanned,
> + vacrel->workers_usage.vacuum.nreserved,
> + vacrel->workers_usage.vacuum.nlaunched);
>
> These log messages need to take care of plural forms but it seems to
> be too long if we use errmsg_plural() for each number. So how about
> something like:
>
> parallel workers: index: %d planned, %d reserved, %d launched in total
> parallel workers: cleanup %d planned, %d reserved, %d launched
>
> (Index cleanup is executed at most once so we don't need "in total" in
> the message.)
Oh, I forgot about plural form preservation. Agree with your suggestion.
**Comments on the 0003 patch**
>
> +typedef struct CostParamsData
> +{
> + double cost_delay;
> + int cost_limit;
> + int cost_page_dirty;
> + int cost_page_hit;
> + int cost_page_miss;
> +} CostParamsData;
>
> The name CostParamsData sounds too generic and I guess it could
> conflict with optimizer-related struct names in the future. How about
> renaming it to VacuumDelayParams?
I agree with the idea to rename this structure. But maybe we should rename
it to "VacuumCostParams"? This name conveys the contents of the structure
better, because enabling these parameters is called "VacuumCostActive".
> + SpinLockAcquire(&pv_shared_cost_params->mutex);
> +
> + shared_params_data = pv_shared_cost_params->params_data;
> +
> + VacuumCostDelay = shared_params_data.cost_delay;
> + VacuumCostLimit = shared_params_data.cost_limit;
> + VacuumCostPageDirty = shared_params_data.cost_page_dirty;
> + VacuumCostPageHit = shared_params_data.cost_page_hit;
> + VacuumCostPageMiss = shared_params_data.cost_page_miss;
> +
> + SpinLockRelease(&pv_shared_cost_params->mutex);
>
> If we copy the shared values in pv_shared_cost_params, we should
> release the spinlock earlier, i.e., before updating VacuumCostXXX
> variables. But I don't think we would even need to set these values in
> the local variables in this case as updating 4 local variables is
> fairly cheap.
>
Do you mean that we can release the spinlock because we have already copied the values
from the shared state to the local variable "shared_params_data"? I added this
variable as an alias for the long string "pv_shared_cost_params->params_data"
and I guess that the compiler will get rid of it.
But now it doesn't seem like a good solution to me anymore. I'll get rid of
the local variable and copy the values directly from the shared state
(under spinlock).
> ---
> + FillCostParamsData(&local_params_data);
> + SpinLockAcquire(&pv_shared_cost_params->mutex);
> +
> + if (CostParamsDataEqual(pv_shared_cost_params->params_data,
> + local_params_data))
> + {
>
> IIUC it stores cost-based vacuum delay parameters into the
> local_params_data only for using CostParamsDataEqual() macro. I think
> it's better to directly compare values in pv_shared_cost_params and
> the cost-based vacuum delay parameters.
I agree.
> > > How about renaming it to use_shared_delay_params? I think it conveys
> > > better what the field is used for.
> >
> > I think that we should leave this name, because in the future some other
> > behavior differences may occur between manual VACUUM and autovacuum.
> > If so, we will already have an "am_autovacuum" field which we can use in
> > the code.
> > The existing logic with the "am_autovacuum" name is also LGTM - we should
> > use shared delay params only because we are running parallel autovacuum.
>
> It may occur but we can change the field name when it really comes.
>
> I'm slightly concerned that we've been using am_xxx variables in a
> different way. For instance, am_walsender is a global variable that is
> set to true only in wal sender processes. Also we have a bunch of
> AmXXProcess() macros that check the global variable MyBackendType to
> determine the kind of the current process. That is, the subject of 'am'
> is typically the process, I guess. On the other hand,
> am_parallel_autovacuum is stored in DSM space and indicates whether a
> parallel vacuum is invoked by manual VACUUM or autovacuum.
Yeah, I agree that "am_xxx" is not the best choice.
What about a simple "bool is_autovacuum"?
**Comments on the 0004 patch**
> If we write the log "%d parallel autovacuum workers have been
> released" in AutoVacuumReleaseParallelWorkers(), can we simplify both
> tests (4 and 5) further?
>
It won't help the 4th test, because ReleaseParallelWorkers is called
due to both ERROR and shmem_exit, but we want to be sure that
workers are released in the try/catch block (i.e. before the shmem_exit).
I thought that we could pass some additional info to the
"ReleaseAllParallelWorkers" such as "bool error_occured", but I decided
not to do so.
Also, I don't know whether the 5th test needs this log at all, because in
the end we are checking the number of free parallel workers. If a killed
a/v leader doesn't release parallel workers, we'll notice it.
> + if (nworkers > 0)
> + INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
>
> I think it's better to use #ifdef USE_INJECTION_POINTS here.
>
Agree. I'll also fix it in vacuumlazy.c
> +#ifdef USE_INJECTION_POINTS
> +/*
> + * Log values of the related to cost-based delay parameters. It is used for
>
> s/values of the related to/values related to/
>
OK
> + * testing purpose.
> + */
> +static void
> +parallel_vacuum_report_cost_based_params(void)
> +{
> + StringInfoData buf;
> +
> + /* Simulate config reload during normal processing */
> + pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
> + vacuum_delay_point(false);
> + pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
>
> Calling vacuum_delay_point() here feels a bit arbitrary to me. Since
> parallel vacuum workers are calling
> parallel_vacuum_report_cost_based_params() after
> parallel_vacuum_process_safe_indexes(), I think we don't necessarily
> call vacuum_delay_point() here.
>
Sure! It is left from the previous implementation of the test. I'll remove
this call.
> + appendStringInfo(&buf, "Vacuum cost-based delay parameters of parallel worker:\n");
> + appendStringInfo(&buf, "vacuum_cost_limit = %d\n",vacuum_cost_limit);
> + appendStringInfo(&buf, "vacuum_cost_delay = %g\n", vacuum_cost_delay);
> + appendStringInfo(&buf, "vacuum_cost_page_miss = %d\n", VacuumCostPageMiss);
> + appendStringInfo(&buf, "vacuum_cost_page_dirty = %d\n", VacuumCostPageDirty);
> + appendStringInfo(&buf, "vacuum_cost_page_hit = %d\n", VacuumCostPageHit);
>
> I'd write these messages directly in elog() instead of using
> StringInfoData, which is simpler and can save palloc()/pfree().
>
OK
> + ereport(DEBUG2, errmsg("%s", buf.data));
>
> Let's use elog() instead of ereport().
>
I suppose this is suggested because we don't want to translate error
messages of DEBUG level. Did I understand you correctly?
> +# Create role with pg_signal_autovacuum_worker for terminating autovacuum worker.
> +$node->safe_psql('postgres', qq{
> + CREATE ROLE regress_worker_role;
> + GRANT pg_signal_autovacuum_worker TO regress_worker_role;
> + SET ROLE regress_worker_role;
> +});
> +
> +$node->safe_psql('postgres', qq{
> + SELECT pg_terminate_backend('$av_pid');
> +});
>
> These two safe_psql calls use separate connections, meaning that
> pg_terminate_backend() is executed by the superuser rather than
> regress_worker_role. I think we don't need to create the
> regress_worker_role and we can use the superuser in this test case.
>
Hm, looks like another piece of code left over from my previous attempts to
implement this test. I'll remove it.
> We would add more autovacuum related tests to the test_autovacuum
> directory in the future. Given that the 001_basic.pl focuses on
> parallel vacuum tests, how about renaming it to 001_parallel_vacuum.pl
> or something?
>
Agree, I'll rename it.
> > This time I'll try something experimental - besides the patches I'll also
> > post differences between corresponding patches from v20 and v21.
> > I.e. you can apply v20--v21-diff-for-0001 on the v20-0001 patch and
> > get the v21-0001 patch. There are a lot of changes, so I guess it will
> > help you during review. Please, let me know whether it is useful for you.
>
> It was helpful to easily see the changes from the previous version. Thank you!
>
I'm glad to hear it :) I will keep this tradition alive.
Thank you very much for the review!
Please, see updated set of patches and diffs between v21 and v22.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v22-0004-Tests-for-parallel-autovacuum.patch (22.6K, 2-v22-0004-Tests-for-parallel-autovacuum.patch)
From 68db56a95032518bf527376e152540cc11ddbb31 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v22 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 49 +++
src/backend/postmaster/autovacuum.c | 28 ++
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 ++
.../t/001_parallel_autovacuum.pl | 319 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 +
.../modules/test_autovacuum/test_autovacuum.c | 41 +++
.../test_autovacuum/test_autovacuum.control | 3 +
13 files changed, 530 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 91be2502c09..6407c10524b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -151,6 +151,7 @@
#include "storage/freespace.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -869,6 +870,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 27a6120b0e3..78ccfede031 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -306,6 +307,10 @@ static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_inde
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+#ifdef USE_INJECTION_POINTS
+static inline void parallel_vacuum_report_cost_based_params(void);
+#endif
+
/*
* Try to enter parallel mode and create a parallel context. Then initialize
* shared memory state.
@@ -918,6 +923,19 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ *
+ * This injection point is also used to wait until parallel workers
+ * finishes their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1295,6 +1313,16 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * If we are parallel autovacuum worker, we can consume delay parameters
+ * during index processing (via vacuum_delay_point call). This logging
+ * allows tests to ensure this.
+ */
+ if (shared->is_autovacuum)
+ parallel_vacuum_report_cost_based_params();
+#endif
+
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
@@ -1347,3 +1375,24 @@ parallel_vacuum_error_callback(void *arg)
return;
}
}
+
+#ifdef USE_INJECTION_POINTS
+/*
+ * Log values related to cost-based vacuum delay parameters. It is used for
+ * testing purpose.
+ */
+static inline void
+parallel_vacuum_report_cost_based_params(void)
+{
+ const char *msg_format =
+ _("Parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d");
+
+ elog(DEBUG2,
+ msg_format,
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+#endif
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 0d78d02bd09..7b24a5d6e67 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2495,12 +2495,20 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ int nreserved_workers = av_nworkers_reserved;
+
/*
* Parallel autovacuum can reserve parallel workers. Make sure
* that all reserved workers are released.
*/
AutoVacuumReleaseAllParallelWorkers();
+ if (nreserved_workers > 0)
+ ereport(DEBUG2,
+ (errmsg("%d parallel autovacuum workers has been released after occured error",
+ nreserved_workers),
+ errhidecontext(true)));
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -3465,6 +3473,21 @@ AutoVacuumReleaseAllParallelWorkers(void)
Assert(av_nworkers_reserved == 0);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ */
+uint32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ uint32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_SHARED);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3633,5 +3656,10 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+ ereport(DEBUG2,
+ (errmsg("number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers),
+ errhidecontext(true)));
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index f3783afb51b..52be260e15f 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
+extern uint32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 44c7163c1cd..937dbb64fd2 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..5ac8d87702d 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..9b80d371f5c
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,319 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my ($psql_out, $log_start);
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# everything we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+
+ INSERT INTO test_autovac (col_1, col_2, col_3, col_4)
+ SELECT
+ g AS col_1,
+ g + 1 AS col_2,
+ g + 2 AS col_3,
+ g + 3 AS col_4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 20, 'All parallel workers have been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during indexes
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/Parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 3);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 1;
+ SELECT pg_reload_conf();
+});
+
+# Since 2 parallel workers already launched and will be released in the future,
+# we are expecting that :
+# 1) number of free workers will be '0' after config reload
+# 2) number of free workers will be '1' after releasing workers
+
+# Check statement (1)
+$log_start = $node->wait_for_log(
+ qr/number of free parallel autovacuum workers is set to 0 due to config reload/,
+ $log_start
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+# Check statement (2)
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 1, 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the case where the leader exits due to an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$log_start = $node->wait_for_log(
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/,
+ $log_start
+);
+
+$log_start = $node->wait_for_log(
+ qr/2 parallel autovacuum workers has been released after occured error/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as the test above, but simulate the leader exiting due to FATAL.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized and wake up the leader
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+$log_start = $node->wait_for_log(
+ qr/terminating autovacuum process due to administrator command/,
+ $log_start
+);
+
+# Now it is safe to check the number of free parallel workers, because even if
+# autovacuum is trying to vacuum table in parallel mode again, the leader
+# worker cannot go any further than "autovacuum-start-parallel-vacuum" point.
+# I.e. no one can interfere and change the number of free parallel workers.
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 10, 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..e5646e0def5
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..959629c7685
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,41 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "commands/vacuum.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_UINT32(nfree_workers);
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
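
For readers following Test 3 and Test 5 above: the bookkeeping behind av_freeParallelWorkers is only partly visible in this patch (just the Max(nfree_workers, 0) clamp in adjust_free_parallel_workers). The standalone sketch below is one model that reproduces the numbers the test expects (20 -> 18 reserved -> 0 after lowering the limit to 1 -> 1 after release -> 10 after raising the limit to 10). All names here (WorkerPool, pool_reserve, pool_release, pool_adjust_max) are hypothetical, and the arithmetic is an assumption inferred from the test comments, not taken from autovacuum.c.

```c
#include <assert.h>

/* Hypothetical model of the shared counters; not the real AutoVacuumShmem. */
typedef struct WorkerPool
{
	int			max_workers;	/* models autovacuum_max_parallel_workers */
	int			free_workers;	/* models av_freeParallelWorkers */
} WorkerPool;

static int
int_min(int a, int b)
{
	return (a < b) ? a : b;
}

static int
int_max(int a, int b)
{
	return (a > b) ? a : b;
}

/* Reserve up to 'requested' workers; returns how many were granted. */
static int
pool_reserve(WorkerPool *pool, int requested)
{
	int			granted = int_min(requested, pool->free_workers);

	assert(requested >= 0);
	pool->free_workers -= granted;
	return granted;
}

/* Give workers back, never exceeding the current limit (assumed behavior). */
static void
pool_release(WorkerPool *pool, int nworkers)
{
	pool->free_workers = int_min(pool->free_workers + nworkers,
								 pool->max_workers);
}

/*
 * Apply a new limit after a config reload: shift the free count by the
 * difference between the limits and clamp it at zero (assumed behavior).
 */
static void
pool_adjust_max(WorkerPool *pool, int new_max)
{
	int			delta = new_max - pool->max_workers;

	pool->free_workers = int_max(pool->free_workers + delta, 0);
	pool->max_workers = new_max;
}
```

Under this model, workers that were reserved before the limit was lowered are simply capped on release, which is why the free count settles at the new limit rather than going above it.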
[text/x-patch] v22-0003-Cost-based-parameters-propagation-for-parallel-a.patch (9.8K, 3-v22-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From f535c603f11233d5ae6eb3ca441027d5196e20ee Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v22 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 23 +++-
src/backend/commands/vacuumparallel.c | 160 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 2 +
5 files changed, 186 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 03932f45c8a..70882544d05 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2430,8 +2430,21 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Possibly update cost-based delay parameters.
+ *
+ * Do it before checking VacuumCostActive, because this call may change
+ * its value.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2445,6 +2458,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * If we are the parallel autovacuum leader and any cost-based
+ * parameters have changed, let the parallel workers know.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 86d9f2b74c9..27a6120b0e3 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -53,6 +53,59 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Helper for the PVSharedCostParams structure (see below), to avoid
+ * repetition.
+ */
+typedef struct VacuumCostParams
+{
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} VacuumCostParams;
+
+#define FillVacCostParams(cost_params) \
+ (cost_params)->cost_delay = vacuum_cost_delay; \
+ (cost_params)->cost_limit = vacuum_cost_limit; \
+ (cost_params)->cost_page_dirty = VacuumCostPageDirty; \
+ (cost_params)->cost_page_hit = VacuumCostPageHit; \
+ (cost_params)->cost_page_miss = VacuumCostPageMiss
+
+#define VacCostParamsEquals(params) \
+ (vacuum_cost_delay == (params).cost_delay && \
+ vacuum_cost_limit == (params).cost_limit && \
+ VacuumCostPageDirty == (params).cost_page_dirty && \
+ VacuumCostPageHit == (params).cost_page_hit && \
+ VacuumCostPageMiss == (params).cost_page_miss)
+
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * Each time the leader worker updates its parameters, it must increase
+ * the generation. Every parallel worker keeps the generation
+ * (shared_params_generation_local) at which it last received
+ * parameters from the leader.
+ *
+ * It is enough for a worker to compare its local generation with the field
+ * below to determine whether it needs to re-read the parameter values.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /*
+ * Copies of the corresponding cost-based vacuum delay parameters from
+ * autovacuum leader process.
+ */
+ VacuumCostParams params_data;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +175,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * leader worker and its parallel autovacuum workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -224,6 +289,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments for the PVSharedCostParams structure for the explanation. */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -395,6 +465,17 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ if (shared->is_autovacuum)
+ {
+ FillVacCostParams(&shared->cost_params.params_data);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -539,6 +620,82 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
&wusage->cleanup);
}
+/*
+ * If we are a parallel *autovacuum* worker, check whether the cost-based
+ * vacuum delay parameters have changed in the leader worker. If so, update
+ * the local parameters to the values the leader worker is currently
+ * operating with.
+ *
+ * For a non-autovacuum parallel worker this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+
+ VacuumCostDelay = pv_shared_cost_params->params_data.cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->params_data.cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->params_data.cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->params_data.cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->params_data.cost_page_miss;
+
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Function to be called from parallel autovacuum leader in order to propagate
+ * some cost-based vacuum delay parameters to the supportive workers.
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+
+ if (VacCostParamsEquals(pv_shared_cost_params->params_data))
+ {
+ /*
+ * We don't need to update shared cost-based vacuum delay params if
+ * they haven't changed.
+ */
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+ return;
+ }
+
+ FillVacCostParams(&pv_shared_cost_params->params_data);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increase generation of the parameters, i.e. let parallel workers know
+ * that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1105,6 +1262,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f40abe90ed5..0d78d02bd09 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1690,7 +1690,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index d3dc4e8cc67..b10829a9379 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -423,6 +423,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index ae1047ddf5d..20fe34f8cc7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2069,6 +2069,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
@@ -3249,6 +3250,7 @@ VacAttrStatsP
VacDeadItemsInfo
VacErrPhase
VacOptValue
+VacuumCostParams
VacuumParams
VacuumRelation
VacuumStmt
--
2.43.0
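
For anyone reviewing the propagation scheme in this patch, the generation-counter handshake can be tried outside the server. The sketch below is a standalone C11 model, not the patch code: CostParams, SharedCostParams, leader_propagate and worker_refresh are hypothetical stand-ins for VacuumCostParams, PVSharedCostParams and the two parallel_vacuum_*_shared_delay_params functions, an atomic_flag spinlock stands in for slock_t, and only two of the five parameters are modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for VacuumCostParams (two fields only). */
typedef struct CostParams
{
	double		cost_delay;
	int			cost_limit;
} CostParams;

/* Hypothetical stand-in for PVSharedCostParams. */
typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped by the leader on every change */
	atomic_flag lock;			/* spinlock protecting 'params' */
	CostParams	params;
} SharedCostParams;

static void
spin_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;						/* busy-wait; critical sections are tiny */
}

static void
spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

static void
shared_init(SharedCostParams *shared, CostParams initial)
{
	atomic_init(&shared->generation, 0);
	atomic_flag_clear(&shared->lock);
	shared->params = initial;
}

/* Leader: publish 'current' if it differs; returns true if published. */
static bool
leader_propagate(SharedCostParams *shared, CostParams current)
{
	bool		changed;

	spin_lock(&shared->lock);
	changed = (shared->params.cost_delay != current.cost_delay ||
			   shared->params.cost_limit != current.cost_limit);
	if (changed)
		shared->params = current;
	spin_unlock(&shared->lock);

	/* Bump the generation only after the new values are in place. */
	if (changed)
		atomic_fetch_add(&shared->generation, 1);
	return changed;
}

/* Worker: copy the shared values only if the generation moved. */
static bool
worker_refresh(SharedCostParams *shared, unsigned int *local_generation,
			   CostParams *local)
{
	unsigned int generation = atomic_load(&shared->generation);

	assert(*local_generation <= generation);
	if (generation == *local_generation)
		return false;			/* cheap path: nothing changed */

	spin_lock(&shared->lock);
	*local = shared->params;
	spin_unlock(&shared->lock);

	*local_generation = generation;
	return true;
}
```

The point of the design is that the common case in the worker is a single atomic read with no lock taken. A worker that reads the generation just before the leader bumps it may copy the new values while still recording the old generation; it then re-reads once more on its next call, which costs an extra copy but never leaves it with stale values.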
[text/x-patch] v22-0002-Logging-for-parallel-autovacuum.patch (10.1K, 4-v22-0002-Logging-for-parallel-autovacuum.patch)
From e8ecbc65ef61acdc8d3184ec93ac4f4877358fc1 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v22 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 61 ++++++++++++++++++++++++++-
src/backend/commands/vacuumparallel.c | 29 ++++++++++---
src/include/commands/vacuum.h | 28 +++++++++++-
src/tools/pgindent/typedefs.list | 3 ++
4 files changed, 111 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 4be267ff657..91be2502c09 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -340,6 +340,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -778,6 +784,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->vm_new_visible_frozen_pages = 0;
vacrel->vm_new_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1120,6 +1131,50 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ /* Stats for vacuum phase of index vacuuming. */
+
+ if (AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL). */
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ /* Stats for cleanup phase of index vacuuming. */
+
+ if (AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d reserved, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nreserved,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ else
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL). */
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2664,7 +2719,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3097,7 +3153,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index d3e0c32b7ee..86d9f2b74c9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -521,7 +522,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +535,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -618,7 +621,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -655,13 +658,23 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ {
AutoVacuumReserveParallelWorkers(&nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nreserved += nworkers;
+ }
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -728,6 +741,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..d3dc4e8cc67 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ int nplanned; /* # of parallel workers we planned to
+ * launch */
+ int nreserved; /* for autovacuum only - # of parallel workers
+ * we have managed to reserve */
+ int nlaunched; /* # of launched parallel workers */
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 241945734ec..ae1047ddf5d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2069,6 +2069,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
@@ -2407,6 +2409,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
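
The three counters this patch adds to parallel_vacuum_process_all_indexes() accumulate over every index vacuum or cleanup round. Below is a minimal standalone sketch of that bookkeeping, not the patch's code: the names WorkersStats and record_round are hypothetical, and the is_autovacuum flag stands in for AmAutoVacuumWorkerProcess().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the patch's PVWorkersStats. */
typedef struct WorkersStats
{
    int nplanned;   /* workers we intended to launch */
    int nreserved;  /* autovacuum only: workers reserved from the pool */
    int nlaunched;  /* workers that actually started */
} WorkersStats;

/*
 * Record one round of parallel index processing.  A NULL stats pointer
 * means the caller did not ask for statistics, as in the patch.
 */
void
record_round(WorkersStats *stats, int planned, int reserved, int launched,
             int is_autovacuum)
{
    /* We can never launch more workers than we planned. */
    assert(launched <= planned);

    if (stats == NULL)
        return;

    if (planned > 0)
    {
        stats->nplanned += planned;

        /* Reservation only happens in the autovacuum case. */
        if (is_autovacuum)
            stats->nreserved += reserved;
    }

    /* Adding zero is harmless, so no separate check is needed. */
    stats->nlaunched += launched;
}
```

Comparing the totals afterwards shows where workers were lost: a gap between nplanned and nreserved points at the autovacuum pool limit, while a gap between nreserved and nlaunched points at the general background worker limits.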
[text/x-patch] v22-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 5-v22-0005-Documentation-for-parallel-autovacuum.patch)
From e192de925dcfc932d1c223c051b71835dceded0e Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v22 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f670e2d4c31..07139ec7ff2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9380,6 +9381,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-parallel-workers"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..4894de021cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based
+ on the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
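
The documentation above describes three inputs to the parallel degree: the autovacuum_parallel_workers storage parameter, the number of indexes, and the autovacuum_max_parallel_workers GUC. The following sketch shows how they might combine, under stated assumptions: effective_workers is a hypothetical helper, nindexes_parallel is assumed to already count only indexes eligible for parallel processing, and the run-time reservation step (which can reduce the number further) is left out.

```c
#include <assert.h>

static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * Hypothetical helper: compute the planned number of parallel workers.
 *
 * reloption          value of autovacuum_parallel_workers (-1 = disabled,
 *                    0 = derive from the number of indexes)
 * nindexes_parallel  number of indexes eligible for parallel processing
 * guc_max            value of autovacuum_max_parallel_workers
 */
int
effective_workers(int reloption, int nindexes_parallel, int guc_max)
{
    int workers;

    assert(guc_max >= 0);

    /* Disabled per table, disabled globally, or nothing to parallelize. */
    if (reloption < 0 || guc_max == 0 || nindexes_parallel <= 0)
        return 0;

    /* A positive reloption is an upper bound; 0 means "use index count". */
    workers = (reloption > 0)
        ? min_int(reloption, nindexes_parallel)
        : nindexes_parallel;

    /* Finally, cap by the GUC. */
    return min_int(workers, guc_max);
}
```

For example, with the default GUC value of 2, a table with four eligible indexes and autovacuum_parallel_workers = 0 would plan two workers in this sketch.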
[text/x-patch] v22-0001-Parallel-autovacuum.patch (19.4K, 6-v22-0001-Parallel-autovacuum.patch)
From d312736690ffe1df7ae73c73dbb2ef334dfa3249 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v22 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..d3e0c32b7ee 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6fde740465f..f40abe90ed5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +292,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -361,6 +372,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -759,6 +771,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -775,6 +789,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must adjust the
+ * number of available parallel autovacuum workers in shmem to match
+ * the new value.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1379,6 +1402,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if the parallel
+ * autovacuum leader is exiting due to a FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2275,6 +2308,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2456,6 +2495,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2856,8 +2901,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3334,6 +3383,88 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter: on input, the number of parallel workers
+ * the caller requests; on output, the number of workers actually reserved.
+ *
+ * The caller must call AutoVacuumRelease[All]ParallelWorkers() to release the
+ * reserved workers.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+}
+
+/*
+ * Releases the reserved parallel workers for autovacuum.
+ *
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Cannot release more workers than reserved */
+ Assert(nworkers <= av_nworkers_reserved);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at the new maximum.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+}
+
+/*
+ * Same as above, but this function releases all the parallel workers that
+ * this autovacuum worker reserved.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3394,6 +3525,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_parallel_workers);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3475,3 +3610,28 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjusts the number of free parallel workers to correspond to the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ int nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..4a5c73a9e33 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and that is handled elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 9507778415d..92b69c65e83 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..ef8126f3790 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b91bc00062..ed59a21289c 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..7c5e35a486c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
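
The reserve/release accounting in autovacuum.c above can be read as a small counting pool. Here is a single-process sketch of that arithmetic, assuming no concurrency: the patch takes AutovacuumLock around each update, which is omitted here. WorkerPool, pool_reserve, pool_release and pool_set_max are hypothetical names, and pool_set_max uses the pool's stored maximum as the "previous" value, whereas the patch passes the previous GUC value explicitly.

```c
#include <assert.h>

/* Hypothetical stand-in for the two shared-memory counters in the patch. */
typedef struct WorkerPool
{
    unsigned int free_workers;  /* cf. av_freeParallelWorkers */
    unsigned int max_workers;   /* cf. av_maxParallelWorkers */
} WorkerPool;

static unsigned int
min_uint(unsigned int a, unsigned int b)
{
    return (a < b) ? a : b;
}

/* Reserve up to 'requested' workers; return how many were granted. */
int
pool_reserve(WorkerPool *pool, int requested)
{
    unsigned int granted;

    assert(requested >= 0);
    granted = min_uint(pool->free_workers, (unsigned int) requested);
    pool->free_workers -= granted;
    return (int) granted;
}

/* Give workers back, capping at the maximum in case the limit shrank. */
void
pool_release(WorkerPool *pool, int nworkers)
{
    assert(nworkers >= 0);
    pool->free_workers = min_uint(pool->free_workers + (unsigned int) nworkers,
                                  pool->max_workers);
}

/* Apply a changed limit: shift the free count by the delta, floor at 0. */
void
pool_set_max(WorkerPool *pool, int new_max)
{
    long nfree;

    assert(new_max >= 0);
    nfree = (long) new_max - (long) pool->max_workers +
        (long) pool->free_workers;
    pool->free_workers = (nfree > 0) ? (unsigned int) nfree : 0;
    pool->max_workers = (unsigned int) new_max;
}
```

The cap in pool_release matters when the limit is lowered while workers are still reserved: without it, returning those workers would push the free count above the new maximum.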
[text/x-patch] v21--v22-diff-for-0003.patch (7.1K, 7-v21--v22-diff-for-0003.patch)
From 2e5ab0a4f025900a61a1e34f5d2d163b6ff23f0d Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 27 Feb 2026 14:45:22 +0700
Subject: [PATCH 1/3] fixes for 0003 patch
---
src/backend/commands/vacuumparallel.c | 74 +++++++++++++--------------
src/tools/pgindent/typedefs.list | 2 +-
2 files changed, 36 insertions(+), 40 deletions(-)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ccb3812165c..27a6120b0e3 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -57,28 +57,28 @@
* Helper for the PVSharedCostParams structure (see below), to avoid
* repetition.
*/
-typedef struct CostParamsData
+typedef struct VacuumCostParams
{
double cost_delay;
int cost_limit;
int cost_page_dirty;
int cost_page_hit;
int cost_page_miss;
-} CostParamsData;
+} VacuumCostParams;
-#define FillCostParamsData(cost_params) \
+#define FillVacCostParams(cost_params) \
(cost_params)->cost_delay = vacuum_cost_delay; \
(cost_params)->cost_limit = vacuum_cost_limit; \
(cost_params)->cost_page_dirty = VacuumCostPageDirty; \
(cost_params)->cost_page_hit = VacuumCostPageHit; \
(cost_params)->cost_page_miss = VacuumCostPageMiss
-#define CostParamsDataEqual(params_1, params_2) \
- ((params_1).cost_delay == (params_2).cost_delay && \
- (params_1).cost_limit == (params_2).cost_limit && \
- (params_1).cost_page_dirty == (params_2).cost_page_dirty && \
- (params_1).cost_page_hit == (params_2).cost_page_hit && \
- (params_1).cost_page_miss == (params_2).cost_page_miss)
+#define VacCostParamsEquals(params) \
+ (vacuum_cost_delay == (params).cost_delay && \
+ vacuum_cost_limit == (params).cost_limit && \
+ VacuumCostPageDirty == (params).cost_page_dirty && \
+ VacuumCostPageHit == (params).cost_page_hit && \
+ VacuumCostPageMiss == (params).cost_page_miss)
/*
* Struct for cost-based vacuum delay related parameters to share among an
@@ -99,8 +99,11 @@ typedef struct PVSharedCostParams
slock_t mutex; /* protects all fields below */
- /* Copies of corresponding parameters from autovacuum leader process */
- CostParamsData params_data;
+ /*
+ * Copies of the corresponding cost-based vacuum delay parameters from
+ * autovacuum leader process.
+ */
+ VacuumCostParams params_data;
} PVSharedCostParams;
/*
@@ -177,11 +180,11 @@ typedef struct PVShared
* If 'true' then we are running parallel autovacuum. Otherwise, we are
* running parallel maintenence VACUUM.
*/
- bool am_parallel_autovacuum;
+ bool is_autovacuum;
/*
- * Struct for syncing parameters between supportive parallel autovacuum
- * workers with leader worker.
+ * Struct for syncing cost-based vacuum delay parameters between
+ * supportive parallel autovacuum workers and the leader worker.
*/
PVSharedCostParams cost_params;
} PVShared;
@@ -462,11 +465,11 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
- shared->am_parallel_autovacuum = AmAutoVacuumWorkerProcess();
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
- if (shared->am_parallel_autovacuum)
+ if (shared->is_autovacuum)
{
- FillCostParamsData(&shared->cost_params.params_data);
+ FillVacCostParams(&shared->cost_params.params_data);
pg_atomic_init_u32(&shared->cost_params.generation, 0);
SpinLockInit(&shared->cost_params.mutex);
@@ -618,10 +621,10 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
}
/*
- * If we are parallel *autovacuum* worker, check whether related to
- * cost-based delay parameters had changed in the leader worker. If
- * so, corresponding parameters will be updated to the values which
- * leader worker is operating on.
+ * If we are a parallel *autovacuum* worker, check whether the cost-based
+ * vacuum delay parameters have changed in the leader worker. If so, the
+ * corresponding parameters will be updated to the values the leader worker
+ * is operating on.
*
* For non-autovacuum parallel worker this function will have no effect.
*/
@@ -629,7 +632,6 @@ void
parallel_vacuum_update_shared_delay_params(void)
{
uint32 params_generation;
- CostParamsData shared_params_data;
Assert(IsParallelWorker());
@@ -646,13 +648,11 @@ parallel_vacuum_update_shared_delay_params(void)
SpinLockAcquire(&pv_shared_cost_params->mutex);
- shared_params_data = pv_shared_cost_params->params_data;
-
- VacuumCostDelay = shared_params_data.cost_delay;
- VacuumCostLimit = shared_params_data.cost_limit;
- VacuumCostPageDirty = shared_params_data.cost_page_dirty;
- VacuumCostPageHit = shared_params_data.cost_page_hit;
- VacuumCostPageMiss = shared_params_data.cost_page_miss;
+ VacuumCostDelay = pv_shared_cost_params->params_data.cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->params_data.cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->params_data.cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->params_data.cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->params_data.cost_page_miss;
SpinLockRelease(&pv_shared_cost_params->mutex);
@@ -663,34 +663,30 @@ parallel_vacuum_update_shared_delay_params(void)
/*
* Function to be called from parallel autovacuum leader in order to propagate
- * some cost-based parameters to the supportive workers.
+ * some cost-based vacuum delay parameters to the supportive workers.
*/
void
parallel_vacuum_propagate_shared_delay_params(void)
{
- CostParamsData local_params_data;
-
Assert(AmAutoVacuumWorkerProcess());
/* Check whether we are running parallel autovacuum */
if (pv_shared_cost_params == NULL)
return;
- FillCostParamsData(&local_params_data);
SpinLockAcquire(&pv_shared_cost_params->mutex);
- if (CostParamsDataEqual(pv_shared_cost_params->params_data,
- local_params_data))
+ if (VacCostParamsEquals(pv_shared_cost_params->params_data))
{
/*
- * We don't need to update shared delay params if they haven't
- * changed.
+ * We don't need to update shared cost-based vacuum delay params if
+ * they haven't changed.
*/
SpinLockRelease(&pv_shared_cost_params->mutex);
return;
}
- FillCostParamsData(&pv_shared_cost_params->params_data);
+ FillVacCostParams(&pv_shared_cost_params->params_data);
SpinLockRelease(&pv_shared_cost_params->mutex);
/*
@@ -1266,7 +1262,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
- if (shared->am_parallel_autovacuum)
+ if (shared->is_autovacuum)
pv_shared_cost_params = &(shared->cost_params);
/* Set parallel vacuum state */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 2d6b57232e6..20fe34f8cc7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -545,7 +545,6 @@ CopyToRoutine
CopyToState
CopyToStateData
Cost
-CostParamsData
CostSelector
Counters
CoverExt
@@ -3251,6 +3250,7 @@ VacAttrStatsP
VacDeadItemsInfo
VacErrPhase
VacOptValue
+VacuumCostParams
VacuumParams
VacuumRelation
VacuumStmt
--
2.43.0
[text/x-patch] v21--v22-diff-for-0002.patch (2.5K, 8-v21--v22-diff-for-0002.patch)
From dd5df106946a188342992f50e587f269881cacae Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 27 Feb 2026 14:03:51 +0700
Subject: [PATCH] fixes for 0002 patch
---
src/backend/access/heap/vacuumlazy.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d19e15cbcce..91be2502c09 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1139,7 +1139,7 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
{
/* Worker usage stats for parallel autovacuum. */
appendStringInfo(&buf,
- _("parallel index vacuum: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
vacrel->workers_usage.vacuum.nplanned,
vacrel->workers_usage.vacuum.nreserved,
vacrel->workers_usage.vacuum.nlaunched);
@@ -1148,7 +1148,7 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
{
/* Worker usage stats for manual VACUUM (PARALLEL). */
appendStringInfo(&buf,
- _("parallel index vacuum: %d workers were planned and %d workers were launched in total\n"),
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
vacrel->workers_usage.vacuum.nplanned,
vacrel->workers_usage.vacuum.nlaunched);
}
@@ -1161,7 +1161,7 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
{
/* Worker usage stats for parallel autovacuum. */
appendStringInfo(&buf,
- _("parallel index cleanup: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ _("parallel workers: index cleanup: %d planned, %d reserved, %d launched\n"),
vacrel->workers_usage.cleanup.nplanned,
vacrel->workers_usage.cleanup.nreserved,
vacrel->workers_usage.cleanup.nlaunched);
@@ -1170,7 +1170,7 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
{
/* Worker usage stats for manual VACUUM (PARALLEL). */
appendStringInfo(&buf,
- _("parallel index cleanup: %d workers were planned and %d workers were launched in total\n"),
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
vacrel->workers_usage.cleanup.nplanned,
vacrel->workers_usage.cleanup.nlaunched);
}
--
2.43.0
[text/x-patch] v21--v22-diff-for-0004.patch (6.1K, 9-v21--v22-diff-for-0004.patch)
From d38013b4abe14b69f4058337cd7231ab1150e12f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 27 Feb 2026 16:15:34 +0700
Subject: [PATCH 3/3] fixes for 0004 patch
---
src/backend/access/heap/vacuumlazy.c | 2 +
src/backend/commands/vacuumparallel.c | 38 ++++++++-----------
.../modules/test_autovacuum/t/001_basic.pl | 21 ++--------
3 files changed, 22 insertions(+), 39 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2498edcc0d5..6407c10524b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -870,11 +870,13 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
/*
* Trigger injection point, if parallel autovacuum is about to be started.
*/
if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
/*
* Call lazy_scan_heap to perform all required heap pruning, index
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 88842c5cec9..78ccfede031 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -308,7 +308,7 @@ static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_inde
static void parallel_vacuum_error_callback(void *arg);
#ifdef USE_INJECTION_POINTS
-static void parallel_vacuum_report_cost_based_params(void);
+static inline void parallel_vacuum_report_cost_based_params(void);
#endif
/*
@@ -923,6 +923,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
/*
* To be able to exercise whether all reserved parallel workers are being
* released anyway, allow injection points to trigger a failure at this
@@ -933,6 +934,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
if (nworkers > 0)
INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1317,7 +1319,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
* during index processing (via vacuum_delay_point call). This logging
* allows tests to ensure this.
*/
- if (shared->am_parallel_autovacuum)
+ if (shared->is_autovacuum)
parallel_vacuum_report_cost_based_params();
#endif
@@ -1376,29 +1378,21 @@ parallel_vacuum_error_callback(void *arg)
#ifdef USE_INJECTION_POINTS
/*
- * Log values of the related to cost-based delay parameters. It is used for
+ * Log values related to cost-based vacuum delay parameters. It is used for
* testing purpose.
*/
-static void
+static inline void
parallel_vacuum_report_cost_based_params(void)
{
- StringInfoData buf;
-
- /* Simulate config reload during normal processing */
- pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
- vacuum_delay_point(false);
- pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
-
- initStringInfo(&buf);
-
- appendStringInfo(&buf, "Vacuum cost-based delay parameters of parallel worker:\n");
- appendStringInfo(&buf, "vacuum_cost_limit = %d\n",vacuum_cost_limit);
- appendStringInfo(&buf, "vacuum_cost_delay = %g\n", vacuum_cost_delay);
- appendStringInfo(&buf, "vacuum_cost_page_miss = %d\n", VacuumCostPageMiss);
- appendStringInfo(&buf, "vacuum_cost_page_dirty = %d\n", VacuumCostPageDirty);
- appendStringInfo(&buf, "vacuum_cost_page_hit = %d\n", VacuumCostPageHit);
-
- ereport(DEBUG2, errmsg("%s", buf.data));
- pfree(buf.data);
+ const char *msg_format =
+ _("Parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d");
+
+ elog(DEBUG2,
+ msg_format,
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
#endif
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
index b3d22361dcf..9b80d371f5c 100644
--- a/src/test/modules/test_autovacuum/t/001_basic.pl
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -109,8 +109,7 @@ $node->safe_psql('postgres', qq{
# Wait until the parallel autovacuum on table is completed. At the same time,
# we check that the required number of parallel workers has been started.
$log_start = $node->wait_for_log(
- qr/parallel index vacuum: 2 workers were planned, / .
- qr/2 workers were reserved and 2 workers were launched in total/,
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
$log_start
);
@@ -162,12 +161,8 @@ $node->wait_for_event(
# Check whether parallel worker successfully updated all parameters during
# index processing
$log_start = $node->wait_for_log(
- qr/Vacuum cost-based delay parameters of parallel worker:\n/ .
- qr/\tvacuum_cost_limit = 500\n/ .
- qr/\tvacuum_cost_delay = 2\n/ .
- qr/\tvacuum_cost_page_miss = 10\n/ .
- qr/\tvacuum_cost_page_dirty = 10\n/ .
- qr/\tvacuum_cost_page_hit = 10\n/,
+ qr/Parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
$log_start
);
@@ -219,8 +214,7 @@ $node->safe_psql('postgres', qq{
# Wait until the end of parallel processing
$log_start = $node->wait_for_log(
- qr/parallel index vacuum: 2 workers were planned, / .
- qr/2 workers were reserved and 2 workers were launched in total/,
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
$log_start
);
@@ -296,13 +290,6 @@ my $av_pid = $node->safe_psql('postgres', qq{
LIMIT 1;
});
-# Create role with pg_signal_autovacuum_worker for terminating autovacuum worker.
-$node->safe_psql('postgres', qq{
- CREATE ROLE regress_worker_role;
- GRANT pg_signal_autovacuum_worker TO regress_worker_role;
- SET ROLE regress_worker_role;
-});
-
$node->safe_psql('postgres', qq{
SELECT pg_terminate_backend('$av_pid');
});
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-02-28 01:56 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-02-28 01:56 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Fri, Feb 27, 2026 at 5:49 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Thu, Feb 26, 2026 at 6:59 AM Masahiko Sawada <[email protected]> wrote:
> >
> > For example, if users want to disable all parallel queries, they can do
> > that by setting max_parallel_workers to 0. If parallel vacuum workers
> > for autovacuums are taken from max_worker_processes pool (i.e.,
> > without max_parallel_workers limit), users would need to set both
> > max_parallel_workers and autovacuum_max_parallel_workers to 0.
> >
>
> This is kinda off-topic already, but I really want to clarify this question.
>
> If parallel a/v workers are not limited by max_parallel_workers and the
> user wants to disable all parallel operations, it is still enough to set
> max_parallel_workers to 0. In this case parallel a/v could not acquire any
> workers from bgworkers pool, and thus the user's goal is reached (and there
> is no need to set autovacuum_max_parallel_workers to 0).
IIUC earlier patches capped autovacuum_max_parallel_workers by
max_worker_processes. Suppose we set:
- max_worker_processes = 8
- autovacuum_max_parallel_workers = 4
- max_parallel_workers = 4
If we want to disable all parallel operations, we would need to set
both max_parallel_workers and autovacuum_max_parallel_workers to 0,
no? This is because if we set only max_parallel_workers to 0,
autovacuum workers can still take parallel vacuum workers from the
max_worker_processes pool. I might be missing something though.
>
> **Comments on the 0003 patch**
>
> >
> > +typedef struct CostParamsData
> > +{
> > + double cost_delay;
> > + int cost_limit;
> > + int cost_page_dirty;
> > + int cost_page_hit;
> > + int cost_page_miss;
> > +} CostParamsData;
> >
> > The name CostParamsData sounds too generic and I guess it could
> > conflict with optimizer-related struct names in the future. How about
> > renaming it to VacuumDelayParams?
>
> I agree with the idea to rename this structure. But maybe we should rename
> it to "VacuumCostParams"? This name conveys the contents of the structure
> better, because enabling these parameters is called "VacuumCostActive".
+1
>
> > + SpinLockAcquire(&pv_shared_cost_params->mutex);
> > +
> > + shared_params_data = pv_shared_cost_params->params_data;
> > +
> > + VacuumCostDelay = shared_params_data.cost_delay;
> > + VacuumCostLimit = shared_params_data.cost_limit;
> > + VacuumCostPageDirty = shared_params_data.cost_page_dirty;
> > + VacuumCostPageHit = shared_params_data.cost_page_hit;
> > + VacuumCostPageMiss = shared_params_data.cost_page_miss;
> > +
> > + SpinLockRelease(&pv_shared_cost_params->mutex);
> >
> > If we copy the shared values in pv_shared_cost_params, we should
> > release the spinlock earlier, i.e., before updating VacuumCostXXX
> > variables. But I don't think we would even need to set these values in
> > the local variables in this case as updating 4 local variables is
> > fairly cheap.
> >
>
> Do you mean that we can release spinlock because we already copied the values
> from the shared state to the local variable "shared_params_data"?
Yes.
> I added this
> variable as an alias for the long string "pv_shared_cost_params->params_data"
> and I guess that compiler will get rid of it.
>
> But now it doesn't seem like a good solution to me anymore. I'll get rid of
> the local variable and copy the values directly from the shared state
> (under spinlock).
Thanks.
>
> > > > How about renaming it to use_shared_delay_params? I think it conveys
> > > > better what the field is used for.
> > >
> > > I think that we should leave this name, because in the future some other
> > > behavior differences may occur between manual VACUUM and autovacuum.
> > > If so, we will already have an "am_autovacuum" field which we can use in
> > > the code.
> > > The existing logic with the "am_autovacuum" name is also LGTM - we should
> > > use shared delay params only because we are running parallel autovacuum.
> >
> > It may occur but we can change the field name when it really comes.
> >
> > I'm slightly concerned that we've been using am_xxx variables in a
> > different way. For instance, am_walsender is a global variable that is
> > set to true only in wal sender processes. Also we have a bunch of
> > AmXXProcess() macros that checks the global variable MyBackendType, to
> > check the kinds of the current process. That is, the subject of 'am'
> > is typically the process, I guess. On the other hand,
> > am_parallel_autovacuum is stored in DSM space and indicates whether a
> > parallel vacuum is invoked by manual VACUUM or autovacuum.
>
> Yeah, I agree that "am_xxx" is not the best choice.
> What about a simple "bool is_autovacuum"?
+1
>
> **Comments on the 0004 patch**
>
> > If we write the log "%d parallel autovacuum workers have been
> > released" in AutoVacuumReleaseParallelWorkers(), can we simplify both
> > tests (4 and 5) further?
> >
>
> It won't help the 4th test, because ReleaseParallelWorkers is called
> due to both ERROR and shmem_exit, but we want to be sure that
> workers are released in the try/catch block (i.e. before the shmem_exit).
We already call AutoVacuumReleaseAllParallelWorker() in the PG_CATCH()
block in do_autovacuum(). If we write the log in
AutoVacuumReleaseParallelWorkers(), the tap test is able to check the
log, no?
> Also, I don't know whether the 5th test needs this log at all, because in
> the end we are checking the number of free parallel workers. If a killed
> a/v leader doesn't release parallel workers, we'll notice it.
If we can check the log written at process shutdown time, I think we
can somewhat simplify the test 5 logic by not attaching
'autovacuum-start-parallel-vacuum' injection point.
1. attach 'autovacuum-leader-before-indexes-processing' injection point.
2. wait for an av worker to stop at the injection point.
3. terminate the av worker.
4. verify from the log if the workers have been released.
5. disable parallel autovacuum.
6. check the free workers (should be 10).
Steps 5 and 6 seem to be optional, though.
>
> > + ereport(DEBUG2, errmsg("%s", buf.data));
> >
> > Let's use elog() instead of ereport().
> >
>
> I suppose this is suggested because we don't want to translate error
> messages of DEBUG level. Did I understand you correctly?
We use ereport() for DEBUG level messages in many places actually. I
suggested it because this message is not a user-facing message.
> Please, see updated set of patches and diffs between v21 and v22.
Thank you for updating the patches! Here are review comments on the
v22 patch set.
* 0001 patch:
+ /*
+ * Max number of parallel autovacuum workers. If value is 0 then parallel
+ * degree will computed based on number of indexes.
+ */
+ int autovacuum_parallel_workers;
I'm a bit concerned that the above description doesn't explain how
many parallel vacuum workers are used when the value is >0, as it
mentions only the maximum number. How about rewording it to:
Target number of parallel autovacuum workers. -1 by default disables
parallel vacuum during autovacuum. 0 means choose the parallel degree
based on the number of indexes.
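To make the proposed semantics concrete, here is a minimal sketch of how such a setting could map to a worker count. It is an assumption for illustration only, not the patch's actual computation: choose_parallel_degree and its parameters are hypothetical names, and the cap by max_workers stands for whatever global limit applies.

```c
#include <assert.h>

/*
 * Hypothetical helper: map the per-table setting to a number of workers.
 *   setting < 0  -> parallel vacuum disabled for autovacuum
 *   setting == 0 -> derive the degree from the number of eligible indexes
 *   setting > 0  -> use it as the target, but never exceed the index count
 * The result is additionally capped by max_workers.
 */
static int
choose_parallel_degree(int setting, int nindexes_parallel, int max_workers)
{
	int			degree;

	assert(nindexes_parallel >= 0);

	if (setting < 0 || nindexes_parallel == 0 || max_workers <= 0)
		return 0;

	if (setting == 0)
		degree = nindexes_parallel;
	else
		degree = (setting < nindexes_parallel) ? setting : nindexes_parallel;

	return (degree < max_workers) ? degree : max_workers;
}
```

With this reading, a positive value is a target rather than a plain maximum: it is used as is unless there are fewer eligible indexes or a lower global cap.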
* 0002 patch:
+ PVWorkersUsage workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
Let's insert a blank line between the new field and the existing comment.
---
+
+ if (AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum. */
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL). */
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ }
These comments are very obvious so I don't think we need them.
Instead, I think it would be good to explain why we don't need to
report "reserved" numbers in the manual vacuum cases.
---
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ /* Stats for vacuum phase of index vacuuming. */
and
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ /* Stats for cleanup phase of index vacuuming. */
+
I don't think we need these comments (the second one has a typo,
though), as they are obvious.
---
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
Please add a brief description of wusage to the function comment. We
can add comments to both parallel_vacuum_bulkdel_all_indexes() and
parallel_vacuum_cleanup_all_indexes() or only
parallel_vacuum_process_all_indexes().
---
@@ -2070,6 +2070,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
@@ -2408,6 +2410,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
PVWorkersUsage is added twice
* 0003 patch:
+#define VacCostParamsEquals(params) \
+ (vacuum_cost_delay == (params).cost_delay && \
+ vacuum_cost_limit == (params).cost_limit && \
+ VacuumCostPageDirty == (params).cost_page_dirty && \
+ VacuumCostPageHit == (params).cost_page_hit && \
+ VacuumCostPageMiss == (params).cost_page_miss)
I'm not sure this macro helps reduce lines of code or improve
readability, as it's used only once, and it's slightly unnatural to me
that an *Equals macro takes only one argument.
* 0004 patch:
+#include "commands/vacuum.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
We can remove some unnecessary header includes. ISTM we need only
fmgr.h, autovacuum.h, and injection_point.h.
---
+ const char *msg_format =
+ _("Parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d");
+
I don't think we need the translation for this message as it's not a
user-facing one.
We don't capitalize the first letter in the error message.
---
+ ereport(DEBUG2,
+ (errmsg("number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers),
+ errhidecontext(true)));
Why do we need to add errhidecontext(true) here?
---
+ 'tests': [
+ 't/001_basic.pl',
+ ],
Needs to be updated to the new filename.
---
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
Please update the copyright years.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-01 14:46 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-01 14:46 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Sat, Feb 28, 2026 at 8:57 AM Masahiko Sawada <[email protected]> wrote:
>
> IIUC earlier patches defined autovacuum_max_parallel_workers with the
> limit by max_worker_processes. Suppose we set:
>
> - max_worker_processes = 8
> - autovacuum_max_parallel_workers = 4
> - max_parallel_workers = 4
>
> If we want to disable all parallel operations, we would need to set
> both max_parallel_workers and autovacuum_max_parallel_workers to 0,
> no? This is because if we set only max_parallel_workers to 0,
> autovacuum workers can still take parallel vacuum workers from the
> max_worker_processes pool. I might be missing something though.
>
Even if av_max_parallel_workers is limited by max_worker_processes,
it is enough to set max_parallel_workers to 0 to disable parallel
autovacuum.
When the a/v leader wants to create supporting workers, it calls the
"RegisterDynamicBackgroundWorker" function, which contains the
following logic:
/*
* If this is a parallel worker, check whether there are already too many
* parallel workers; if so, don't register another one.
*/
if (parallel && (BackgroundWorkerData->parallel_register_count -
BackgroundWorkerData->parallel_terminate_count) >=
max_parallel_workers)
{
....
}
Thus, a/v leader cannot launch any workers if max_parallel_workers is set to 0.
> > > If we write the log "%d parallel autovacuum workers have been
> > > released" in AutoVacuumReleaseParallelWorkers(), can we simplify both
> > > tests (4 and 5) further?
> > >
> >
> > It won't help the 4th test, because ReleaseParallelWorkers is called
> > due to both ERROR and shmem_exit, but we want to be sure that
> > workers are released in the try/catch block (i.e. before the shmem_exit).
>
> We already call AutoVacuumReleaseAllParallelWorker() in the PG_CATCH()
> block in do_autovacuum(). If we write the log in
> AutoVacuumReleaseParallelWorkers(), the tap test is able to check the
> log, no?
>
Not quite. Assume that we add a "%d workers have been released" log message
to ReleaseAllParallelWorkers. Then we trigger an error in the a/v leader and
wait for this message (we expect the workers to be released inside the
try/catch block).
Even if there is a bug in the code and the a/v leader fails to release the
parallel workers after the error, it will eventually finish vacuuming and call
"proc_exit". During "proc_exit", the "before_shmem_exit_hook" will run and
call "ReleaseAllParallelWorkers".
I.e. we will see the desired message and mistakenly consider the test
passed.
> > Also, I don't know whether the 5th test needs this log at all, because in
> > the end we are checking the number of free parallel workers. If a killed
> > a/v leader doesn't release parallel workers, we'll notice it.
>
> If we can check the log written at process shutdown time, I think we
> can somewhat simplify the test 5 logic by not attaching
> 'autovacuum-start-parallel-vacuum' injection point.
>
> 1. attach 'autovacuum-leader-before-indexes-processing' injection point.
> 2. wait for an av worker to stop at the injection point.
> 3. terminate the av worker.
> 4. verify from the log if the workers have been released.
> 5. disable parallel autovacuum.
> 6. check the free workers (should be 10).
>
> Steps 5 and 6 seem to be optional, though.
OK, I see your point. But I'm afraid that the "%d released" log can't help
us here, for the reason I described above:
the "%d released" message can be emitted from several places, and we cannot
be sure which one emitted it.
I propose doing the same as we did for the try/catch block: add logging inside
"autovacuum_worker_before_shmem_exit" with some unique message.
That way we can be sure that the workers are released precisely in the
"before_shmem_exit_hook".
The alternative is to pass some additional information to the
"ReleaseAllParallelWorkers" function (to supplement the log it emits), but
that doesn't seem like a good solution to me.
**Comments on the 0001 patch**
> + /*
> + * Max number of parallel autovacuum workers. If value is 0 then parallel
> + * degree will computed based on number of indexes.
> + */
> + int autovacuum_parallel_workers;
>
> I'm a bit concerned that the above description doesn't explain how
> many parallel vacuum workers are used when the value is >0, as it
> mentions only the maximum number. How about rewording it to:
>
> Target number of parallel autovacuum workers. -1 by default disables
> parallel vacuum during autovacuum. 0 means choose the parallel degree
> based on the number of indexes.
>
I agree.
**Comments on the 0002 patch**
> + PVWorkersUsage workers_usage;
> /* Counters that follow are only for scanned_pages */
> int64 tuples_deleted; /* # deleted from table */
>
> Let's insert a new line between the new line and the existing line.
>
OK
> + if (AmAutoVacuumWorkerProcess())
> + {
> + /* Worker usage stats for parallel autovacuum. */
> + appendStringInfo(&buf,
> + _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
> + vacrel->workers_usage.vacuum.nplanned,
> + vacrel->workers_usage.vacuum.nreserved,
> + vacrel->workers_usage.vacuum.nlaunched);
> + }
> + else
> + {
> + /* Worker usage stats for manual VACUUM (PARALLEL). */
> + appendStringInfo(&buf,
> + _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
> + vacrel->workers_usage.vacuum.nplanned,
> + vacrel->workers_usage.vacuum.nlaunched);
> + }
> + }
>
> These comments are very obvious so I don't think we need them.
I agree.
> Instead, I think it would be good to explain why we don't need to
> report "reserved" numbers in the manual vacuum cases.
>
I think that we can clarify somewhere why the "reserved" statistic
is collected only for autovacuum. PVWorkersStats is an appropriate
place for it. Then there is no need to explain it where the log
message is constructed.
> ---
> + if (vacrel->workers_usage.vacuum.nplanned > 0)
> + {
> + /* Stats for vacuum phase of index vacuuming. */
>
> and
>
> + if (vacrel->workers_usage.cleanup.nplanned > 0)
> + {
> + /* Stats for cleanup phase of index vacuuming. */
> +
>
> I don't think we need these comments (the second one has a typo
> though) as it's obvious.
>
I agree.
> */
> void
> parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
> - int num_index_scans)
> + int num_index_scans, PVWorkersUsage *wusage)
>
> Please add a brief description of wusage to the function comment. We
> can add comments to both parallel_vacuum_bulkdel_all_indexes() and
> parallel_vacuum_cleanup_all_indexes() or only
> parallel_vacuum_process_all_indexes().
>
OK. I think that adding a comment only to the
parallel_vacuum_process_all_indexes will be more appropriate.
(I'm not sure if the comment I came up with looks good, but I couldn't
formulate it better).
>
> PVWorkersUsage is added twice
>
Oops
**Comments on the 0003 patch**
>
> +#define VacCostParamsEquals(params) \
> + (vacuum_cost_delay == (params).cost_delay && \
> + vacuum_cost_limit == (params).cost_limit && \
> + VacuumCostPageDirty == (params).cost_page_dirty && \
> + VacuumCostPageHit == (params).cost_page_hit && \
> + VacuumCostPageMiss == (params).cost_page_miss)
>
> I'm not sure this macro helps reduce lines of code or improve
> readability as it's used only once and it's slightly unnatural to me
> that *Equals macro takes only one argument.
>
I agree, it looks a bit odd. I'll remove it.
Moreover, this shmem state can be updated only by the a/v leader worker,
so I'll allow it to read shared variables without holding a spinlock.
It seems pretty reliable, what do you think?
**Comments on the 0004 patch**
> +#include "commands/vacuum.h"
> +#include "fmgr.h"
> +#include "miscadmin.h"
> +#include "postmaster/autovacuum.h"
> +#include "storage/shmem.h"
> +#include "storage/ipc.h"
> +#include "storage/lwlock.h"
> +#include "utils/builtins.h"
> +#include "utils/injection_point.h"
>
> We can remove some unnecessary header includes. ISTM we need only
> fmgr.h, autovacuum.h, and injection_point.h.
>
Agree, I'll remove unused includes.
> + const char *msg_format =
> + _("Parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d");
> +
>
> I don't think we need the translation for this message as it's not a
> user-facing one.
>
> We don't capitalize the first letter in the error message.
>
I agree.
> ---
> + ereport(DEBUG2,
> + (errmsg("number of free parallel autovacuum workers is set to %u due to config reload",
> + AutoVacuumShmem->av_freeParallelWorkers),
> + errhidecontext(true)));
>
> Why do we need to add errhidecontext(true) here?
>
I thought we don't need to write redundant info to the logfile. But I
don't see that other DEBUG2 messages are hiding context, so
I'll remove it.
BTW, do we want to use "elog" here too?
> ---
> + 'tests': [
> + 't/001_basic.pl',
> + ],
>
> Need to be updated to the new filename.
>
> ---
> + * Copyright (c) 2020-2025, PostgreSQL Global Development Group
>
> Please update the copyright years.
>
Yeah, I forgot about it. Will fix it.
Thank you very much for the review!
Please, see the updated set of patches.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v23-0003-Cost-based-parameters-propagation-for-parallel-a.patch (9.8K, 2-v23-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From f91c52c22340349f9c2459ff565408e75b4d9197 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v23 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 23 +++-
src/backend/commands/vacuumparallel.c | 163 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 2 +
5 files changed, 189 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 03932f45c8a..70882544d05 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2430,8 +2430,21 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Possibly update cost-based delay parameters.
+ *
+ * Do it before checking VacuumCostActive, because its value might be
+ * changed after calling this function.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2445,6 +2458,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * If we are the parallel autovacuum leader and some of the cost-based
+ * parameters have changed, let the parallel workers know.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 643849b2fb8..80b57bf9da3 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -54,6 +54,52 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Helper for the PVSharedCostParams structure (see below), to avoid
+ * repetition.
+ */
+typedef struct VacuumCostParams
+{
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} VacuumCostParams;
+
+#define FillVacCostParams(cost_params) \
+ (cost_params)->cost_delay = vacuum_cost_delay; \
+ (cost_params)->cost_limit = vacuum_cost_limit; \
+ (cost_params)->cost_page_dirty = VacuumCostPageDirty; \
+ (cost_params)->cost_page_hit = VacuumCostPageHit; \
+ (cost_params)->cost_page_miss = VacuumCostPageMiss
+
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * Each time the leader worker updates its parameters, it must increase
+ * the generation. Every parallel worker keeps the generation
+ * (shared_params_generation_local) at which it last received
+ * parameters from the leader.
+ *
+ * It is enough for a worker to compare its local generation with the field
+ * below to determine whether it needs to receive new parameter values.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /*
+ * Copies of the corresponding cost-based vacuum delay parameters from
+ * autovacuum leader process.
+ */
+ VacuumCostParams params_data;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -123,6 +169,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * supporting parallel autovacuum workers and the leader worker.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -225,6 +283,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments for the PVSharedCostParams structure for the explanation. */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -396,6 +459,17 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ if (shared->is_autovacuum)
+ {
+ FillVacCostParams(&shared->cost_params.params_data);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -540,6 +614,92 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
&wusage->cleanup);
}
+/*
+ * If we are a parallel *autovacuum* worker, check whether the cost-based
+ * vacuum delay parameters have changed in the leader worker. If so, the
+ * corresponding parameters are updated to the values the leader worker
+ * is operating on.
+ *
+ * For a non-autovacuum parallel worker this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if the parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+
+ VacuumCostDelay = pv_shared_cost_params->params_data.cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->params_data.cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->params_data.cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->params_data.cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->params_data.cost_page_miss;
+
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Function to be called from parallel autovacuum leader in order to propagate
+ * some cost-based vacuum delay parameters to the supportive workers.
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ VacuumCostParams *params_data;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Only leader worker can modify this shared structure, so we can read it
+ * without acquiring a lock.
+ */
+ params_data = &pv_shared_cost_params->params_data;
+
+ if (vacuum_cost_delay == params_data->cost_delay &&
+ vacuum_cost_limit == params_data->cost_limit &&
+ VacuumCostPageDirty == params_data->cost_page_dirty &&
+ VacuumCostPageHit == params_data->cost_page_hit &&
+ VacuumCostPageMiss == params_data->cost_page_miss)
+ {
+ /*
+ * We don't need to update shared cost-based vacuum delay params if
+ * they haven't changed.
+ */
+ return;
+ }
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ FillVacCostParams(&pv_shared_cost_params->params_data);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increase generation of the parameters, i.e. let parallel workers know
+ * that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1109,6 +1269,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f40abe90ed5..0d78d02bd09 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1690,7 +1690,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1b1fb625cb2..4bfeba8264d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -434,6 +434,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 536237ff546..de9f576e0f3 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2070,6 +2070,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
@@ -3249,6 +3250,7 @@ VacAttrStatsP
VacDeadItemsInfo
VacErrPhase
VacOptValue
+VacuumCostParams
VacuumParams
VacuumRelation
VacuumStmt
--
2.43.0
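The generation-counter handshake used by this patch can be illustrated outside the backend. What follows is only a minimal, single-threaded sketch under stated assumptions, not the patch's code: CostParams, SharedCostParams, WorkerView, shared_init, leader_propagate and worker_update are hypothetical names, only two of the five cost parameters are carried, C11 <stdatomic.h> stands in for pg_atomic_uint32, and the spinlock that the patch takes around the copy is left out because nothing here runs concurrently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's VacuumCostParams (two fields only). */
typedef struct CostParams
{
	double		cost_delay;
	int			cost_limit;
} CostParams;

/*
 * Hypothetical stand-in for PVSharedCostParams.  The real struct also has a
 * spinlock protecting the parameters; it is omitted in this sketch.
 */
typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped by the leader on every change */
	CostParams	params;			/* the leader's current values */
} SharedCostParams;

/* What each worker remembers locally (hypothetical). */
typedef struct WorkerView
{
	uint32_t	local_generation;	/* generation last seen by this worker */
	CostParams	params;				/* values this worker operates on */
} WorkerView;

static void
shared_init(SharedCostParams *shared, CostParams initial)
{
	atomic_init(&shared->generation, 0);
	shared->params = initial;
}

/*
 * Leader side: publish new values only if something changed, then bump the
 * generation.  Returns 1 if the values were published, 0 otherwise.
 */
static int
leader_propagate(SharedCostParams *shared, CostParams current)
{
	if (current.cost_delay == shared->params.cost_delay &&
		current.cost_limit == shared->params.cost_limit)
		return 0;

	shared->params = current;	/* the patch does this under the spinlock */
	atomic_fetch_add(&shared->generation, 1);
	return 1;
}

/*
 * Worker side: a cheap generation comparison first; copy the parameters only
 * when the leader has published something new.  Returns 1 if values were
 * refreshed, 0 otherwise.
 */
static int
worker_update(SharedCostParams *shared, WorkerView *view)
{
	uint32_t	generation = atomic_load(&shared->generation);

	if (generation == view->local_generation)
		return 0;

	view->params = shared->params;	/* the patch copies under the spinlock */
	view->local_generation = generation;
	return 1;
}
```

The point of the design, as far as it can be read from the patch, is that the common case in vacuum_delay_point() costs a worker one atomic read and no lock; the lock is only taken after a config reload actually changed something in the leader.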
[text/x-patch] v23-0002-Logging-for-parallel-autovacuum.patch (10.2K, 3-v23-0002-Logging-for-parallel-autovacuum.patch)
From 4b4fe2edb57dc6d910dc7a37161bede4e752d982 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v23 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 54 ++++++++++++++++++++++++++-
src/backend/commands/vacuumparallel.c | 32 +++++++++++++---
src/include/commands/vacuum.h | 39 ++++++++++++++++++-
src/tools/pgindent/typedefs.list | 2 +
4 files changed, 117 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5d271d80967..1e2d5be1af2 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -342,6 +342,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -780,6 +787,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1122,6 +1134,42 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d reserved, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nreserved,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2666,7 +2714,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3099,7 +3148,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 806a7f48326..643849b2fb8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -228,7 +228,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -503,7 +503,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -514,7 +514,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -522,7 +523,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -534,7 +536,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -616,10 +619,13 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the statistics it stores will be updated according
+ * to what happens during function execution.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -656,13 +662,23 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ {
AutoVacuumReserveParallelWorkers(&nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nreserved += nworkers;
+ }
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -729,6 +745,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..1b1fb625cb2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,39 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ /* Number of parallel workers we planned to launch */
+ int nplanned;
+
+ /*
+ * Number of parallel workers we have managed to reserve.
+ *
+ * Note that we collect this statistic only for parallel *autovacuum*,
+ * since it must reserve workers in shared state before actually
+ * trying to launch them (in order to meet the
+ * autovacuum_max_parallel_workers limit). Manual VACUUM (PARALLEL), on
+ * the contrary, doesn't need to reserve workers.
+ */
+ int nreserved;
+
+ /* Number of launched parallel workers */
+ int nlaunched;
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +427,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 77e3c04144e..536237ff546 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2070,6 +2070,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
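The planned / reserved / launched counters logged by this patch follow the reserve-then-launch flow around AutoVacuumReserveParallelWorkers(). The sketch below is a simplified model of that accounting under stated assumptions, not the patch's implementation: WorkerPool, WorkersStats, pool_reserve, pool_release and run_index_pass are hypothetical names, the pool is a plain struct instead of shared memory guarded by AutovacuumLock, and the can_launch argument merely simulates how many workers the system manages to start.

```c
#include <assert.h>

/* Hypothetical model of the free/max counters kept in autovacuum shmem. */
typedef struct WorkerPool
{
	int			free_workers;
	int			max_workers;
} WorkerPool;

/* Hypothetical counterpart of PVWorkersStats. */
typedef struct WorkersStats
{
	int			nplanned;
	int			nreserved;
	int			nlaunched;
} WorkersStats;

static int
min_int(int a, int b)
{
	return (a < b) ? a : b;
}

/* Grant as many workers as are free, up to the requested number. */
static int
pool_reserve(WorkerPool *pool, int requested)
{
	int			granted = min_int(pool->free_workers, requested);

	pool->free_workers -= granted;
	return granted;
}

/* Give workers back, never exceeding the (possibly lowered) maximum. */
static void
pool_release(WorkerPool *pool, int nworkers)
{
	pool->free_workers = min_int(pool->free_workers + nworkers,
								 pool->max_workers);
}

/*
 * One index vacuum or cleanup pass: plan, reserve, launch, give back what
 * could not be launched, and release the rest once the pass is done.
 * Returns the number of workers launched.
 */
static int
run_index_pass(WorkerPool *pool, WorkersStats *stats,
			   int planned, int can_launch)
{
	int			reserved;
	int			launched;

	stats->nplanned += planned;

	reserved = pool_reserve(pool, planned);
	stats->nreserved += reserved;

	launched = min_int(reserved, can_launch);
	if (launched < reserved)
		pool_release(pool, reserved - launched);
	stats->nlaunched += launched;

	/* ... parallel index processing would happen here ... */

	pool_release(pool, launched);
	return launched;
}
```

With a pool of two free workers, a pass that plans three and can only start one ends with 3 planned, 2 reserved and 1 launched, and the pool back at two free workers. The cap in pool_release() mirrors the patch's handling of a limit that was lowered while workers were reserved; the launcher-side adjustment on config reload is not modeled here.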
[text/x-patch] v23-0001-Parallel-autovacuum.patch (19.4K, 4-v23-0001-Parallel-autovacuum.patch)
From 24bef2b3e5736b363d113882fa9a53475c47fc36 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v23 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 8 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..806a7f48326 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
@@ -374,8 +377,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -554,12 +558,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -598,8 +607,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -647,6 +656,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -691,6 +707,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -739,6 +765,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6fde740465f..f40abe90ed5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +292,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -361,6 +372,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -759,6 +771,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -775,6 +789,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1379,6 +1402,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2275,6 +2308,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2456,6 +2495,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2856,8 +2901,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3334,6 +3383,88 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter; the requested number of parallel workers
+ * to reserve by the caller, and set to the actual number of reserved workers.
+ *
+ * The caller must call AutoVacuumRelease[All]ParallelWorkers() to release the
+ * reserved workers.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+}
+
+/*
+ * Releases the reserved parallel workers for autovacuum.
+ *
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Cannot release more workers than reserved */
+ Assert(nworkers <= av_nworkers_reserved);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at the new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+}
+
+/*
+ * Same as above, but this function releases all the parallel workers that
+ * this autovacuum worker reserved.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3394,6 +3525,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_parallel_workers);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3475,3 +3610,28 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjust the number of free parallel workers to match the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ int nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..4a5c73a9e33 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only the cost-based
+ * delay parameters need to be propagated to parallel vacuum workers, and
+ * that is handled elsewhere where appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 9507778415d..92b69c65e83 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Sets the maximum number of parallel autovacuum workers that can be taken from the background worker pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..ef8126f3790 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index b2dba6d10ab..414c5e61db0 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..11dd3aebc6c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,14 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Target number of parallel autovacuum workers. The default, -1, disables
+ * parallel vacuum during autovacuum. 0 means choose the parallel degree
+ * based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
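The free-worker accounting in the patch above (reserve, release, and adjustment on config reload) can be followed with a small standalone model. This is only a sketch, not the patch's code: PoolModel and the pool_* functions are hypothetical names, locking is omitted, and the capping in pool_reserve is an assumption, since only the tail of AutoVacuumReserveParallelWorkers() is visible in this part of the thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the two shared-memory counters in the patch. */
typedef struct
{
	int			max_workers;	/* models av_maxParallelWorkers */
	int			free_workers;	/* models av_freeParallelWorkers */
} PoolModel;

int
min_int(int a, int b)
{
	return a < b ? a : b;
}

int
max_int(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Reserve up to 'wanted' workers and return how many were granted.
 * Assumption: the request is capped by the number of free workers.
 */
int
pool_reserve(PoolModel *pool, int wanted)
{
	int			granted = min_int(wanted, pool->free_workers);

	pool->free_workers -= granted;
	return granted;
}

/* Give workers back, never exceeding the (possibly reduced) maximum. */
void
pool_release(PoolModel *pool, int nworkers)
{
	assert(nworkers >= 0);
	pool->free_workers = min_int(pool->free_workers + nworkers,
								 pool->max_workers);
}

/* Apply a new maximum, as adjust_free_parallel_workers() does on reload. */
void
pool_adjust(PoolModel *pool, int new_max)
{
	int			nfree = new_max - pool->max_workers + pool->free_workers;

	pool->free_workers = max_int(nfree, 0);
	pool->max_workers = new_max;
}
```

Starting from a pool of 20 with 20 free, reserving 2 leaves 18 free. Lowering the maximum to 1 gives 1 - 20 + 18 = -1, which is clamped to 0. Releasing the 2 reserved workers then yields Min(0 + 2, 1) = 1 free worker.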
[text/x-patch] v23-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 5-v23-0005-Documentation-for-parallel-autovacuum.patch)
From d208fa25894c2c417091aee781df4fd7a45ecc90 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v23 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f670e2d4c31..07139ec7ff2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9380,6 +9381,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is
+ capped by <xref linkend="guc-max-parallel-workers"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..4894de021cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
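As a rough illustration of how the documented values of autovacuum_parallel_workers could be interpreted, here is a minimal sketch. The function names are hypothetical, and the rule used for the value 0 (one worker per index beyond the one the leader handles) is an assumption made for illustration; this part of the thread only says the degree is "computed based on the number of indexes".

```c
#include <assert.h>

/*
 * Hypothetical helper: number of parallel workers to request for a table.
 *
 * reloption is the table's autovacuum_parallel_workers value:
 *   -1  parallel index vacuuming is disabled (the default)
 *    0  derive the degree from the number of indexes
 *   >0  explicit upper bound for this table
 */
int
requested_parallel_workers(int reloption, int nindexes)
{
	int			by_indexes;

	assert(nindexes >= 0);

	if (reloption < 0)
		return 0;

	/* Assumption: the leader processes one index itself. */
	by_indexes = nindexes > 1 ? nindexes - 1 : 0;

	if (reloption == 0)
		return by_indexes;

	return reloption < by_indexes ? reloption : by_indexes;
}

/*
 * The request may still be reduced at run time when fewer workers are
 * free, in which case autovacuum runs with fewer workers than expected.
 */
int
granted_parallel_workers(int requested, int free_workers)
{
	return requested < free_workers ? requested : free_workers;
}
```

Under these assumptions, a table with five indexes and the parameter set to 2 requests two workers, and it receives only one if a single worker is free at that moment.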
[text/x-patch] v23-0004-Tests-for-parallel-autovacuum.patch (21.3K, 6-v23-0004-Tests-for-parallel-autovacuum.patch)
From 438f0b699f63bdeee7e0da17859d401cc8df92da Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v23 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 30 ++
src/backend/postmaster/autovacuum.c | 38 +++
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../t/001_parallel_autovacuum.pl | 299 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 +
.../modules/test_autovacuum/test_autovacuum.c | 35 ++
.../test_autovacuum/test_autovacuum.control | 3 +
13 files changed, 495 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1e2d5be1af2..510e5846cd2 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -872,6 +873,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 80b57bf9da3..1d352b44e44 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -40,6 +40,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -925,6 +926,19 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * To be able to test that all reserved parallel workers are released even
+ * on failure, allow injection points to trigger a failure at this
+ * point.
+ *
+ * This injection point is also used to wait until parallel workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1302,6 +1316,22 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * If we are a parallel autovacuum worker, we can consume delay parameters
+ * during index processing (via vacuum_delay_point call). This logging
+ * allows tests to verify this.
+ */
+ if (shared->is_autovacuum)
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+#endif
+
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 0d78d02bd09..d17fc92e783 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1409,7 +1409,18 @@ avl_sigusr2_handler(SIGNAL_ARGS)
static void
autovacuum_worker_before_shmem_exit(int code, Datum arg)
{
+ int nreserved_old = av_nworkers_reserved;
+
AutoVacuumReleaseAllParallelWorkers();
+
+ if (nreserved_old > 0)
+ {
+ elog(DEBUG2,
+ ngettext("autovacuum worker before_shmem_exit: %d parallel worker has been released",
+ "autovacuum worker before_shmem_exit: %d parallel workers has been released",
+ nreserved_old - av_nworkers_reserved),
+ nreserved_old - av_nworkers_reserved);
+ }
}
/*
@@ -2495,12 +2506,20 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ int nreserved_workers = av_nworkers_reserved;
+
/*
* Parallel autovacuum can reserve parallel workers. Make sure
* that all reserved workers are released.
*/
AutoVacuumReleaseAllParallelWorkers();
+ if (nreserved_workers > 0)
+ ereport(DEBUG2,
+ (errmsg("%d parallel autovacuum workers has been released after occured error",
+ nreserved_workers),
+ errhidecontext(true)));
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -3465,6 +3484,21 @@ AutoVacuumReleaseAllParallelWorkers(void)
Assert(av_nworkers_reserved == 0);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ */
+uint32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ uint32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_SHARED);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3633,5 +3667,9 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+ elog(DEBUG2,
+ "number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index f3783afb51b..52be260e15f 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
+extern uint32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 44c7163c1cd..937dbb64fd2 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..5ac8d87702d 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..75b24814b13
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..edfbde73aac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,299 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 20, 'All parallel workers have been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during indexes
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 3);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 1;
+ SELECT pg_reload_conf();
+});
+
+# Since 2 parallel workers are already launched and will be released later,
+# we expect that:
+# 1) number of free workers will be '0' after config reload
+# 2) number of free workers will be '1' after releasing workers
+
+# Check statement (1)
+$log_start = $node->wait_for_log(
+ qr/number of free parallel autovacuum workers is set to 0 due to config reload/,
+ $log_start
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+# Check statement (2)
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 1, 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$log_start = $node->wait_for_log(
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/,
+ $log_start
+);
+
+$log_start = $node->wait_for_log(
+ qr/2 parallel autovacuum workers has been released after occured error/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as the above test, but simulate the situation where the leader exits due to FATAL.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the autovacuum leader has reserved parallel workers, then kill it
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker before_shmem_exit: 2 parallel workers has been released/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..e5646e0def5
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..195a6149a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,35 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "postmaster/autovacuum.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_UINT32(nfree_workers);
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
[text/x-patch] v22--v23-diff-for-0004.patch (7.4K, 7-v22--v23-diff-for-0004.patch)
From 5fa08ede2171b68b5b4652743d6401e3f6652f98 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 1 Mar 2026 21:36:00 +0700
Subject: [PATCH 3/3] fixes for 0004 patch
---
src/backend/commands/vacuumparallel.c | 33 ++++---------------
src/backend/postmaster/autovacuum.c | 18 +++++++---
src/test/modules/test_autovacuum/meson.build | 2 +-
.../t/001_parallel_autovacuum.pl | 26 ++-------------
.../modules/test_autovacuum/test_autovacuum.c | 8 +----
5 files changed, 26 insertions(+), 61 deletions(-)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 54210a4971d..1d352b44e44 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -301,10 +301,6 @@ static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_inde
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
-#ifdef USE_INJECTION_POINTS
-static inline void parallel_vacuum_report_cost_based_params(void);
-#endif
-
/*
* Try to enter parallel mode and create a parallel context. Then initialize
* shared memory state.
@@ -1327,7 +1323,13 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
* allows tests to ensure this.
*/
if (shared->is_autovacuum)
- parallel_vacuum_report_cost_based_params();
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
#endif
/* Report buffer/WAL usage during parallel execution */
@@ -1382,24 +1384,3 @@ parallel_vacuum_error_callback(void *arg)
return;
}
}
-
-#ifdef USE_INJECTION_POINTS
-/*
- * Log values related to cost-based vacuum delay parameters. It is used for
- * testing purpose.
- */
-static inline void
-parallel_vacuum_report_cost_based_params(void)
-{
- const char *msg_format =
- _("Parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d");
-
- elog(DEBUG2,
- msg_format,
- vacuum_cost_limit,
- vacuum_cost_delay,
- VacuumCostPageMiss,
- VacuumCostPageDirty,
- VacuumCostPageHit);
-}
-#endif
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 7b24a5d6e67..d17fc92e783 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1409,7 +1409,18 @@ avl_sigusr2_handler(SIGNAL_ARGS)
static void
autovacuum_worker_before_shmem_exit(int code, Datum arg)
{
+ int nreserved_old = av_nworkers_reserved;
+
AutoVacuumReleaseAllParallelWorkers();
+
+ if (nreserved_old > 0)
+ {
+ elog(DEBUG2,
+ ngettext("autovacuum worker before_shmem_exit: %d parallel worker has been released",
+ "autovacuum worker before_shmem_exit: %d parallel workers has been released",
+ nreserved_old - av_nworkers_reserved),
+ nreserved_old - av_nworkers_reserved);
+ }
}
/*
@@ -3656,10 +3667,9 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
- ereport(DEBUG2,
- (errmsg("number of free parallel autovacuum workers is set to %u due to config reload",
- AutoVacuumShmem->av_freeParallelWorkers),
- errhidecontext(true)));
+ elog(DEBUG2,
+ "number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers);
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
index 3441e5e49cf..75b24814b13 100644
--- a/src/test/modules/test_autovacuum/meson.build
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -30,7 +30,7 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
},
'tests': [
- 't/001_basic.pl',
+ 't/001_parallel_autovacuum.pl',
],
},
}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 9b80d371f5c..edfbde73aac 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -161,7 +161,7 @@ $node->wait_for_event(
# Check whether parallel worker successfully updated all parameters during
# index processing
$log_start = $node->wait_for_log(
- qr/Parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
$log_start
);
@@ -264,20 +264,11 @@ $node->safe_psql('postgres', qq{
prepare_for_next_test($node, 5);
$node->safe_psql('postgres', qq{
- SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
ALTER TABLE test_autovac SET (autovacuum_enabled = true);
});
-# Wait until parallel autovacuum is inited and wake up the leader
-$node->wait_for_event(
- 'autovacuum worker',
- 'autovacuum-start-parallel-vacuum'
-);
-$node->safe_psql('postgres', qq{
- SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
-});
-
+# Wait until the autovacuum leader has reserved parallel workers, then kill it
$node->wait_for_event(
'autovacuum worker',
'autovacuum-leader-before-indexes-processing'
@@ -295,23 +286,12 @@ $node->safe_psql('postgres', qq{
});
$log_start = $node->wait_for_log(
- qr/terminating autovacuum process due to administrator command/,
+ qr/autovacuum worker before_shmem_exit: 2 parallel workers has been released/,
$log_start
);
-# Now it is safe to check the number of free parallel workers, because even if
-# autovacuum is trying to vacuum table in parallel mode again, the leader
-# worker cannot go any further than "autovacuum-start-parallel-vacuum" point.
-# I.e. no one can interfere and change the number of free parallel workers.
-
-$psql_out = $node->safe_psql('postgres', qq{
- SELECT get_parallel_autovacuum_free_workers();
-});
-is($psql_out, 10, 'All parallel workers has been released by the leader after FATAL');
-
# Cleanup
$node->safe_psql('postgres', qq{
- SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
});
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
index 959629c7685..195a6149a5d 100644
--- a/src/test/modules/test_autovacuum/test_autovacuum.c
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -3,7 +3,7 @@
* test_autovacuum.c
* Helpers to write tests for parallel autovacuum
*
- * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
*
* IDENTIFICATION
* src/test/modules/test_autovacuum/test_autovacuum.c
@@ -13,14 +13,8 @@
#include "postgres.h"
-#include "commands/vacuum.h"
#include "fmgr.h"
-#include "miscadmin.h"
#include "postmaster/autovacuum.h"
-#include "storage/shmem.h"
-#include "storage/ipc.h"
-#include "storage/lwlock.h"
-#include "utils/builtins.h"
#include "utils/injection_point.h"
PG_MODULE_MAGIC;
--
2.43.0
[text/x-patch] v22--v23-diff-for-0003.patch (2.3K, 8-v22--v23-diff-for-0003.patch)
From 57a2f96d489fe9c2b81b0310d7b9efcdd45460bc Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 28 Feb 2026 18:36:08 +0700
Subject: [PATCH 1/3] fixes for 0003 patch
---
src/backend/commands/vacuumparallel.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0ee4019f561..80b57bf9da3 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -74,13 +74,6 @@ typedef struct VacuumCostParams
(cost_params)->cost_page_hit = VacuumCostPageHit; \
(cost_params)->cost_page_miss = VacuumCostPageMiss
-#define VacCostParamsEquals(params) \
- (vacuum_cost_delay == (params).cost_delay && \
- vacuum_cost_limit == (params).cost_limit && \
- VacuumCostPageDirty == (params).cost_page_dirty && \
- VacuumCostPageHit == (params).cost_page_hit && \
- VacuumCostPageMiss == (params).cost_page_miss)
-
/*
* Struct for cost-based vacuum delay related parameters to share among an
* autovacuum worker and its parallel vacuum workers.
@@ -669,24 +662,34 @@ parallel_vacuum_update_shared_delay_params(void)
void
parallel_vacuum_propagate_shared_delay_params(void)
{
+ VacuumCostParams *params_data;
+
Assert(AmAutoVacuumWorkerProcess());
/* Check whether we are running parallel autovacuum */
if (pv_shared_cost_params == NULL)
return;
- SpinLockAcquire(&pv_shared_cost_params->mutex);
+ /*
+ * Only leader worker can modify this shared structure, so we can read it
+ * without acquiring a lock.
+ */
+ params_data = &pv_shared_cost_params->params_data;
- if (VacCostParamsEquals(pv_shared_cost_params->params_data))
+ if (vacuum_cost_delay == params_data->cost_delay &&
+ vacuum_cost_limit == params_data->cost_limit &&
+ VacuumCostPageDirty == params_data->cost_page_dirty &&
+ VacuumCostPageHit == params_data->cost_page_hit &&
+ VacuumCostPageMiss == params_data->cost_page_miss)
{
/*
* We don't need to update shared cost-based vacuum delay params if
* they haven't changed.
*/
- SpinLockRelease(&pv_shared_cost_params->mutex);
return;
}
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
FillVacCostParams(&pv_shared_cost_params->params_data);
SpinLockRelease(&pv_shared_cost_params->mutex);
--
2.43.0
[text/x-patch] v22--v23-diff-for-0001.patch (890B, 9-v22--v23-diff-for-0001.patch)
From 6242a38164eb1f81fb8735ea508100685a6ffa9a Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 28 Feb 2026 17:33:45 +0700
Subject: [PATCH] fixes for 0001 patch
---
src/include/utils/rel.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 7c5e35a486c..11dd3aebc6c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -313,8 +313,9 @@ typedef struct AutoVacOpts
bool enabled;
/*
- * Max number of parallel autovacuum workers. If value is 0 then parallel
- * degree will computed based on number of indexes.
+ * Target number of parallel autovacuum workers. -1 by default disables
+ * parallel vacuum during autovacuum. 0 means choose the parallel degree
+ * based on the number of indexes.
*/
int autovacuum_parallel_workers;
--
2.43.0
[text/x-patch] v22--v23-diff-for-0002.patch (4.6K, 10-v22--v23-diff-for-0002.patch)
From bc4479dab3a81fc26e15b55f1c053c46c6cc5279 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 28 Feb 2026 18:11:58 +0700
Subject: [PATCH] fixes for 0002 patch
---
src/backend/access/heap/vacuumlazy.c | 9 +--------
src/backend/commands/vacuumparallel.c | 3 +++
src/include/commands/vacuum.h | 21 ++++++++++++++++-----
src/tools/pgindent/typedefs.list | 1 -
4 files changed, 20 insertions(+), 14 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 055ddf566dc..1e2d5be1af2 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -348,6 +348,7 @@ typedef struct LVRelState
* index scans.
*/
PVWorkersUsage workers_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -1135,11 +1136,8 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->lpdead_items);
if (vacrel->workers_usage.vacuum.nplanned > 0)
{
- /* Stats for vacuum phase of index vacuuming. */
-
if (AmAutoVacuumWorkerProcess())
{
- /* Worker usage stats for parallel autovacuum. */
appendStringInfo(&buf,
_("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
vacrel->workers_usage.vacuum.nplanned,
@@ -1148,7 +1146,6 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
}
else
{
- /* Worker usage stats for manual VACUUM (PARALLEL). */
appendStringInfo(&buf,
_("parallel workers: index vacuum: %d planned, %d launched in total\n"),
vacrel->workers_usage.vacuum.nplanned,
@@ -1157,11 +1154,8 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
}
if (vacrel->workers_usage.cleanup.nplanned > 0)
{
- /* Stats for cleanup phase of index vacuuming. */
-
if (AmAutoVacuumWorkerProcess())
{
- /* Worker usage stats for parallel autovacuum. */
appendStringInfo(&buf,
_("parallel workers: index cleanup: %d planned, %d reserved, %d launched\n"),
vacrel->workers_usage.cleanup.nplanned,
@@ -1170,7 +1164,6 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
}
else
{
- /* Worker usage stats for manual VACUUM (PARALLEL). */
appendStringInfo(&buf,
_("parallel workers: index cleanup: %d planned, %d launched\n"),
vacrel->workers_usage.cleanup.nplanned,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 177264cb2e6..643849b2fb8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -619,6 +619,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the statistics it stores will be updated according
+ * to what happens during function execution.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index d3dc4e8cc67..1b1fb625cb2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -305,11 +305,22 @@ typedef struct VacDeadItemsInfo
*/
typedef struct PVWorkersStats
{
- int nplanned; /* # of parallel workers we are planned to
- * launch */
- int nreserved; /* for autovacuum only - # of parallel workers
- * we have managed to reserve */
- int nlaunched; /* # of launched parallel workers */
+ /* Number of parallel workers we plan to launch */
+ int nplanned;
+
+ /*
+ * Number of parallel workers we have managed to reserve.
+ *
+ * Note that we collect this statistic only for parallel *autovacuum*,
+ * since it must reserve workers in shared state before actually
+ * trying to launch them (in order to meet the
+ * autovacuum_max_parallel_workers limit). Manual VACUUM (PARALLEL), by
+ * contrast, doesn't need to reserve workers.
+ */
+ int nreserved;
+
+ /* Number of launched parallel workers */
+ int nlaunched;
} PVWorkersStats;
/*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 6ceb6cac14f..536237ff546 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2410,7 +2410,6 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
-PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-02 22:25 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-02 22:25 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sun, Mar 1, 2026 at 6:46 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Sat, Feb 28, 2026 at 8:57 AM Masahiko Sawada <[email protected]> wrote:
> >
> > IIUC earlier patches defined autovacuum_max_parallel_workers with the
> > limit by max_worker_processes. Suppose we set:
> >
> > - max_worker_processes = 8
> > - autovacuum_max_parallel_workers = 4
> > - max_parallel_workers = 4
> >
> > If we want to disable all parallel operations, we would need to set
> > max_parallel_workers to 0 as well as
> > autovacuum_max_parallel_workers to 0, no? This is because if we set
> > only max_parallel_workers to 0, autovacuum workers still can take
> > parallel vacuum workers from the max_worker_processes pool. I might be
> > missing something though.
> >
>
> Even if av_max_parallel_workers is limited by max_worker_processes,
> it is enough to set max_parallel_workers to 0 to disable parallel
> autovacuum.
>
> When the a/v leader wants to create supporting workers, it calls the
> "RegisterDynamicBackgroundWorker" function, which contains the following
> logic:
> /*
> * If this is a parallel worker, check whether there are already too many
> * parallel workers; if so, don't register another one.
> */
> if (parallel && (BackgroundWorkerData->parallel_register_count -
> BackgroundWorkerData->parallel_terminate_count) >=
> max_parallel_workers)
> {
> ....
> }
>
> Thus, a/v leader cannot launch any workers if max_parallel_workers is set to 0.
Right. But this fact would actually support that limiting
autovacuum_max_parallel_workers by max_parallel_workers is more
appropriate, no?
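
For illustration, the effect of the quoted check can be sketched as a
small standalone C model. This is not the PostgreSQL code: WorkerCounters,
can_register_parallel_worker() and register_workers() are hypothetical
names, and only the comparison against max_parallel_workers is taken from
the quoted snippet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two counters used by the quoted check. */
typedef struct
{
	unsigned	parallel_register_count;
	unsigned	parallel_terminate_count;
} WorkerCounters;

/*
 * Mirrors the quoted condition: a parallel worker may be registered only
 * while the number of live parallel workers is below max_parallel_workers.
 */
static bool
can_register_parallel_worker(const WorkerCounters *c, int max_parallel_workers)
{
	assert(max_parallel_workers >= 0);

	return (int) (c->parallel_register_count - c->parallel_terminate_count)
		< max_parallel_workers;
}

/* Hypothetical helper: try to register nrequested workers, return how many fit. */
static int
register_workers(WorkerCounters *c, int nrequested, int max_parallel_workers)
{
	int			nlaunched = 0;

	for (int i = 0; i < nrequested; i++)
	{
		if (!can_register_parallel_worker(c, max_parallel_workers))
			break;
		c->parallel_register_count++;
		nlaunched++;
	}

	return nlaunched;
}
```

With max_parallel_workers = 0 the first registration attempt is already
rejected, whatever limit the caller computed on its own side.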
>
> > > > If we write the log "%d parallel autovacuum workers have been
> > > > released" in AutoVacuumReleaseParallelWorkers(), can we simplify both
> > > > tests (4 and 5) further?
> > > >
> > >
> > > It won't help the 4th test, because ReleaseParallelWorkers is called
> > > due to both ERROR and shmem_exit, but we want to be sure that
> > > workers are released in the try/catch block (i.e. before the shmem_exit).
> >
> > We already call AutoVacuumReleaseAllParallelWorkers() in the PG_CATCH()
> > block in do_autovacuum(). If we write the log in
> > AutoVacuumReleaseParallelWorkers(), the tap test is able to check the
> > log, no?
> >
>
> Not quite. Assume that we add "%d workers have been released" log to the
> ReleaseAllParallelWorkers. Then we trigger an error for a/v leader and wait
> for this log (we are expecting that workers will be released inside the
> try/catch block).
>
> Even if there is a bug in the code and a/v leader cannot release parallel
> workers due to occurred error, one day it will finish vacuuming and call
> "proc_exit". During "proc_exit" the "before_shmem_exit_hook" along with
> the "ReleaseAllParallelWorkers" will be called.
What bugs are you concerned about in this case? I'm not sure what you
meant by "a/v leader cannot release parallel workers due to occurred
error". It sounds like you are describing a case where there is a bug in
AutoVacuumReleaseParallelWorkers(), but if such a bug exists and the
leader fails to release parallel workers, we would end up not writing
these elogs in either case.
>
> > > Also, I don't know whether the 5th test needs this log at all, because in
> > > the end we are checking the number of free parallel workers. If a killed
> > > a/v leader doesn't release parallel workers, we'll notice it.
> >
> > If we can check the log written at process shutdown time, I think we
> > can somewhat simplify the test 5 logic by not attaching
> > 'autovacuum-start-parallel-vacuum' injection point.
> >
> > 1. attach 'autovacuum-leader-before-indexes-processing' injection point.
> > 2. wait for an av worker to stop at the injection point.
> > 3. terminate the av worker.
> > 4. verify from the log if the workers have been released.
> > 5. disable parallel autovacuum.
> > 6. check the free workers (should be 10).
> >
> > Step 5 and 6 seems to be optional though.
>
> OK, I see your point. But I'm afraid that the "%d released" log can't help
> us here for the reason I described above :
> "%d released" can be called from several places and we cannot be sure
> which one has emitted this log.
>
> I propose to do the same as we did for try/catch block - add logging inside
> the "autovacuum_worker_before_shmem_exit" with some unique message.
> Thus, we will be sure that the workers are released precisely in the
> "before_shmem_exit_hook".
>
> The alternative is to pass some additional information to the
> "ReleaseAllParallelWorkers" function (to supplement the log it emits), but it
> doesn't seem like a good solution to me.
I'm not sure if it's important to check how
AutoVacuumReleaseAllParallelWorkers() has been called (either in
PG_CATCH() block or by autovacuum_worker_before_shmem_exit()). We
would end up having to add a unique message to each caller of
AutoVacuumReleaseAllParallelWorkers() in the future. I guess it's more
important to make sure that all workers have been released in the end.
In that sense, it would make more sense to check that all workers have
actually been released (i.e., checking by
get_parallel_autovacuum_free_workers()) after a parallel vacuum
instead of checking workers being released by debug logs. That is, we
can check at each test end if get_parallel_autovacuum_free_workers()
returns the expected number after disabling parallel autovacuum.
>
> > + if (AmAutoVacuumWorkerProcess())
> > + {
> > + /* Worker usage stats for parallel autovacuum. */
> > + appendStringInfo(&buf,
> > + _("parallel workers: index
> > vacuum: %d planned, %d reserved, %d launched in total\n"),
> > + vacrel->workers_usage.vacuum.nplanned,
> > + vacrel->workers_usage.vacuum.nreserved,
> > + vacrel->workers_usage.vacuum.nlaunched);
> > + }
> > + else
> > + {
> > + /* Worker usage stats for manual VACUUM (PARALLEL). */
> > + appendStringInfo(&buf,
> > + _("parallel workers: index
> > vacuum: %d planned, %d launched in total\n"),
> > + vacrel->workers_usage.vacuum.nplanned,
> > + vacrel->workers_usage.vacuum.nlaunched);
> > + }
> > + }
> >
> > These comments are very obvious so I don't think we need them.
>
> I agree.
>
> > Instead, I think it would be good to explain why we don't need to
> > report "reserved" numbers in the manual vacuum cases.
> >
>
> I think that we can clarify somewhere why the "reserved" statistic
> is collected only for autovacuum. PVWorkersStats is an appropriate
> place for it. Thus, there will be no need to write something during
> constructing the log.
On second thought about "planned" and "reserved": can we treat what the
patch implements as "reserved" as the "planned" number in autovacuum
cases? That is, in autovacuum cases, the "planned" number would account
for the parallel degree based on the number of indexes (or the
autovacuum_parallel_workers value) as well as the number of workers
that have actually been reserved. When autovacuum_max_parallel_workers
is too low, users would notice from the logs that fewer workers are
planned than the number of indexes on the table would warrant. That
might be less confusing for users than introducing a new "reserved"
concept in the vacuum logs. Also, it slightly simplifies the code.
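
A minimal sketch of that idea, under stated assumptions: the function
planned_workers() and its parameters are hypothetical, the way the
parallel degree is derived is simplified, and only the meaning of -1 and
0 for autovacuum_parallel_workers follows the rel.h comment in the 0001
fixes.

```c
#include <assert.h>

/* Hypothetical helper: smaller of two ints. */
static int
min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical: fold the reservation into the "planned" number.
 *
 * nindexes                    - indexes eligible for parallel processing
 * autovacuum_parallel_workers - -1 disables, 0 means "based on indexes"
 * nfree_workers               - workers still available for reservation
 */
static int
planned_workers(int nindexes, int autovacuum_parallel_workers, int nfree_workers)
{
	int			degree;

	assert(nindexes >= 0 && nfree_workers >= 0);

	if (autovacuum_parallel_workers < 0)
		return 0;				/* parallel autovacuum disabled for the table */

	degree = (autovacuum_parallel_workers == 0)
		? nindexes
		: min_int(autovacuum_parallel_workers, nindexes);

	/* A shortage of free workers shows up directly in the planned number. */
	return min_int(degree, nfree_workers);
}
```

In this model a table with six eligible indexes and only two free workers
reports two planned workers, so the shortage is visible without a
separate "reserved" figure.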
>
> **Comments on the 0003 patch**
>
> >
> > +#define VacCostParamsEquals(params) \
> > + (vacuum_cost_delay == (params).cost_delay && \
> > + vacuum_cost_limit == (params).cost_limit && \
> > + VacuumCostPageDirty == (params).cost_page_dirty && \
> > + VacuumCostPageHit == (params).cost_page_hit && \
> > + VacuumCostPageMiss == (params).cost_page_miss)
> >
> > I'm not sure this macro helps reduce lines of code or improve
> > readability as it's used only once and it's slightly unnatural to me
> > that *Equals macro takes only one argument.
> >
>
> I agree, it looks a bit odd. I'll remove it.
> Moreover, this shmem state can be updated only by the a/v leader worker,
> so I'll allow it to read shared variables without holding a spinlock.
> It seems pretty reliable, what do you think?
Right. It's safe for the leader to read these fields without locks.
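
The single-writer argument can be sketched outside PostgreSQL roughly as
follows. This is only an illustration: SharedCostParams, leader_propagate()
and worker_refresh() are hypothetical names, an atomic_flag stands in for
slock_t, only two of the five parameters are modelled, and bumping the
generation after releasing the lock is an assumption rather than
something shown in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct
{
	atomic_uint generation;		/* bumped by the leader on every update */
	atomic_flag mutex;			/* stand-in for slock_t */
	double		cost_delay;
	int			cost_limit;
} SharedCostParams;

static void
shared_init(SharedCostParams *s, double delay, int limit)
{
	atomic_init(&s->generation, 0);
	atomic_flag_clear(&s->mutex);
	s->cost_delay = delay;
	s->cost_limit = limit;
}

static void
lock(SharedCostParams *s)
{
	while (atomic_flag_test_and_set(&s->mutex))
		;						/* spin */
}

static void
unlock(SharedCostParams *s)
{
	atomic_flag_clear(&s->mutex);
}

/*
 * Leader side. The leader is the only writer, so comparing against the
 * shared fields without the lock cannot observe a torn value. Writes still
 * take the lock because workers read concurrently.
 */
static bool
leader_propagate(SharedCostParams *s, double delay, int limit)
{
	if (s->cost_delay == delay && s->cost_limit == limit)
		return false;			/* nothing changed, nothing to publish */

	lock(s);
	s->cost_delay = delay;
	s->cost_limit = limit;
	unlock(s);

	atomic_fetch_add(&s->generation, 1);
	return true;
}

/* Worker side: refresh local copies only when the generation has moved. */
static bool
worker_refresh(SharedCostParams *s, unsigned *local_gen, double *delay, int *limit)
{
	unsigned	gen = atomic_load(&s->generation);

	if (gen == *local_gen)
		return false;

	lock(s);
	*delay = s->cost_delay;
	*limit = s->cost_limit;
	unlock(s);

	*local_gen = gen;
	return true;
}
```

The worker pays for the lock only when the generation differs from its
local copy, which keeps the common path in vacuum_delay_point() cheap.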
> > ---
> > + ereport(DEBUG2,
> > + (errmsg("number of free parallel autovacuum workers is set
> > to %u due to config reload",
> > + AutoVacuumShmem->av_freeParallelWorkers),
> > + errhidecontext(true)));
> >
> > Why do we need to add errhidecontext(true) here?
> >
>
> I thought we don't need to write redundant info to the logfile. But I
> don't see that other DEBUG2 messages are hiding context, so
> I'll remove it.
>
> BTW, do we want to use "elog" here too?
+1
Here are some comments:
* 0001 patch:
* of the worker list (see above).
@@ -299,6 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
We should use int32 instead of uint32.
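
One possible reason, stated here as an assumption since the comment does
not spell it out: with an unsigned counter, a clamp such as Max(x, 0)
cannot catch a value that went "negative". The sketch below uses
hypothetical functions and made-up arithmetic for a config reload that
lowers the limit; it is not the code of adjust_free_parallel_workers().

```c
#include <assert.h>
#include <stdint.h>

#define Max(x, y)	((x) > (y) ? (x) : (y))

/*
 * Unsigned variant: when the limit shrinks by more than the number of free
 * workers, the result wraps around and Max(..., 0) clamps nothing.
 */
static uint32_t
adjust_free_unsigned(uint32_t nfree, uint32_t old_max, uint32_t new_max)
{
	uint32_t	result = nfree + new_max - old_max;

	return Max(result, 0);
}

/* Signed variant: the intermediate value can be negative and is clamped. */
static int32_t
adjust_free_signed(int32_t nfree, int32_t old_max, int32_t new_max)
{
	int32_t		result = nfree + (new_max - old_max);

	assert(old_max >= 0 && new_max >= 0);

	return Max(result, 0);
}
```

With 2 free workers and the limit lowered from 10 to 4, the signed
variant yields 0, while the unsigned one yields a huge positive number.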
* 0003 patch:
I've attached the proposed changes to the 0003 patch, which includes:
- removal of VacuumCostParams as it's not necessary.
- comment updates.
- other cosmetic updates.
* 0004 patch:
+#ifdef USE_INJECTION_POINTS
+ /*
+ * If we are parallel autovacuum worker, we can consume delay parameters
+ * during index processing (via vacuum_delay_point call). This logging
+ * allows tests to ensure this.
+ */
+ if (shared->is_autovacuum)
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d,
cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d,
cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+#endif
While it's true that we use these logs only during the regression
tests that are enabled only when injection points are also enabled,
these logs themselves are not related to the injection points. I'd
recommend writing these logs when the worker refreshes its local delay
parameters (i.e., in parallel_vacuum_update_shared_delay_params()).
---
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
It would be better to set log_autovacuum_min_duration = 0 on the
specific table instead of setting it globally.
---
+ uint32 nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_UINT32(nfree_workers);
+}
As I commented above, I think we should use int32 for the number of
parallel free workers. So let's change it here too.
---
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+ ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
I think we don't necessarily need to check USE_INJECTION_POINTS in
this function as we already have the check in the TAP tests. The
function itself works even without injection points.
---
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
Please update the copyright year here too.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-patch] change_0003_masahiko.patch (10.1K, 2-change_0003_masahiko.patch)
commit b269c5eb3ca039f3a9b7b59878b1a575a97ba607
Author: Masahiko Sawada <[email protected]>
Date: Mon Mar 2 14:05:02 2026 -0800
Proposed changes for 0003 patch.
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 70882544d05..644739483c8 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2436,10 +2436,8 @@ vacuum_delay_point(bool is_analyze)
if (IsParallelWorker())
{
/*
- * Possibly update cost-based delay parameters.
- *
- * Do it before checking VacuumCostActive, because its value might be
- * changed after calling this function.
+ * Update cost-based delay parameters for a parallel autovacuum worker
+ * if any changes are detected.
*/
parallel_vacuum_update_shared_delay_params();
}
@@ -2460,8 +2458,8 @@ vacuum_delay_point(bool is_analyze)
VacuumUpdateCosts();
/*
- * If we are parallel autovacuum leader and some of cost-based
- * parameters had changed, let other parallel workers know.
+ * Propagate cost-based parameters to shared memory if any of them
+ * have changed during the config reload.
*/
parallel_vacuum_propagate_shared_delay_params();
}
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 80b57bf9da3..c411ded2e7f 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based delay parameters
+ * from the leader to its workers, as the leader's parameters can change
+ * even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -54,26 +61,6 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
-/*
- * Helper for the PVSharedCostParams structure (see below), to avoid
- * repetition.
- */
-typedef struct VacuumCostParams
-{
- double cost_delay;
- int cost_limit;
- int cost_page_dirty;
- int cost_page_hit;
- int cost_page_miss;
-} VacuumCostParams;
-
-#define FillVacCostParams(cost_params) \
- (cost_params)->cost_delay = vacuum_cost_delay; \
- (cost_params)->cost_limit = vacuum_cost_limit; \
- (cost_params)->cost_page_dirty = VacuumCostPageDirty; \
- (cost_params)->cost_page_hit = VacuumCostPageHit; \
- (cost_params)->cost_page_miss = VacuumCostPageMiss
-
/*
* Struct for cost-based vacuum delay related parameters to share among an
* autovacuum worker and its parallel vacuum workers.
@@ -81,23 +68,22 @@ typedef struct VacuumCostParams
typedef struct PVSharedCostParams
{
/*
- * Each time leader worker updates its parameters, it must increase
- * generation. Every parallel worker keeps the generation
- * (shared_params_local_generation) at which it had last time received
- * parameters from the leader.
- *
- * It is enough for worker to compare it's local_generation with the field
- * below to determine whether it needs to receive new parameters' values.
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based parameters. Parallel vacuum workers
+ * compare this with their local generation,
+ * shared_params_generation_local, to detect if they need to refresh their
+ * local parameter copies.
*/
pg_atomic_uint32 generation;
slock_t mutex; /* protects all fields below */
- /*
- * Copies of the corresponding cost-based vacuum delay parameters from
- * autovacuum leader process.
- */
- VacuumCostParams params_data;
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
} PVSharedCostParams;
/*
@@ -285,7 +271,7 @@ struct ParallelVacuumState
static PVSharedCostParams *pv_shared_cost_params = NULL;
-/* See comments for the PVSharedCostParams structure for the explanation. */
+/* See comments in the PVSharedCostParams for details */
static uint32 shared_params_generation_local = 0;
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
@@ -299,6 +285,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -461,9 +448,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+ /* Initialize shared cost-based parameters if it's for autovacuum */
if (shared->is_autovacuum)
{
- FillVacCostParams(&shared->cost_params.params_data);
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
pg_atomic_init_u32(&shared->cost_params.generation, 0);
SpinLockInit(&shared->cost_params.mutex);
@@ -615,10 +603,21 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
}
/*
- * If we are parallel *autovacuum* worker, check whether related to cost-based
- * vacuum delay parameters had changed in the leader worker. If so,
- * corresponding parameters will be updated to the values which leader worker
- * is operating on.
+ * Set cost-based delay parameter values to the given 'params'.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel vacuum workers
+ * launched by an autovacuum worker.
*
* For non-autovacuum parallel worker this function will have no effect.
*/
@@ -629,7 +628,7 @@ parallel_vacuum_update_shared_delay_params(void)
Assert(IsParallelWorker());
- /* Check whether we are running parallel autovacuum */
+ /* Quick return if the worker is not running for autovacuum */
if (pv_shared_cost_params == NULL)
return;
@@ -641,13 +640,11 @@ parallel_vacuum_update_shared_delay_params(void)
return;
SpinLockAcquire(&pv_shared_cost_params->mutex);
-
- VacuumCostDelay = pv_shared_cost_params->params_data.cost_delay;
- VacuumCostLimit = pv_shared_cost_params->params_data.cost_limit;
- VacuumCostPageDirty = pv_shared_cost_params->params_data.cost_page_dirty;
- VacuumCostPageHit = pv_shared_cost_params->params_data.cost_page_hit;
- VacuumCostPageMiss = pv_shared_cost_params->params_data.cost_page_miss;
-
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
SpinLockRelease(&pv_shared_cost_params->mutex);
VacuumUpdateCosts();
@@ -656,46 +653,41 @@ parallel_vacuum_update_shared_delay_params(void)
}
/*
- * Function to be called from parallel autovacuum leader in order to propagate
- * some cost-based vacuum delay parameters to the supportive workers.
+ * Store the cost-based vacuum delay parameters in shared memory so that
+ * parallel vacuum workers can pick them up (see
+ * parallel_vacuum_update_shared_delay_params()).
*/
void
parallel_vacuum_propagate_shared_delay_params(void)
{
- VacuumCostParams *params_data;
-
Assert(AmAutoVacuumWorkerProcess());
- /* Check whether we are running parallel autovacuum */
+ /*
+ * Quick return if the leader process is not sharing the delay
+ * parameters.
+ */
if (pv_shared_cost_params == NULL)
return;
/*
- * Only leader worker can modify this shared structure, so we can read it
- * without acquiring a lock.
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
*/
- params_data = &pv_shared_cost_params->params_data;
-
- if (vacuum_cost_delay == params_data->cost_delay &&
- vacuum_cost_limit == params_data->cost_limit &&
- VacuumCostPageDirty == params_data->cost_page_dirty &&
- VacuumCostPageHit == params_data->cost_page_hit &&
- VacuumCostPageMiss == params_data->cost_page_miss)
- {
- /*
- * We don't need to update shared cost-based vacuum delay params if
- * they haven't changed.
- */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
return;
- }
+ /* Update the shared delay parameters */
SpinLockAcquire(&pv_shared_cost_params->mutex);
- FillVacCostParams(&pv_shared_cost_params->params_data);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
SpinLockRelease(&pv_shared_cost_params->mutex);
/*
- * Increase generation of the parameters, i.e. let parallel workers know
- * that they should re-read shared cost params.
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
*/
pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index de9f576e0f3..1120646f2c8 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3250,7 +3250,6 @@ VacAttrStatsP
VacDeadItemsInfo
VacErrPhase
VacOptValue
-VacuumCostParams
VacuumParams
VacuumRelation
VacuumStmt
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-04 06:58 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-04 06:58 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Mar 3, 2026 at 5:26 AM Masahiko Sawada <[email protected]> wrote:
>
> On Sun, Mar 1, 2026 at 6:46 AM Daniil Davydov <[email protected]> wrote:
> >
> > Thus, a/v leader cannot launch any workers if max_parallel_workers is set to 0.
>
> Right. But this fact would actually support that limiting
> autovacuum_max_parallel_workers by max_parallel_workers is more
> appropriate, no?
>
av_max_parallel_workers is actually limited by max_parallel_workers only
during shmem init. After that we can change it to a value that is higher
than max_parallel_workers, and nothing bad will happen (obviously).
So my point was: why should we have this explicit limitation if it
1) doesn't guard us against anything bad and 2) can be violated at any time
(via ALTER SYSTEM SET ...)?
Now it seems to me that limiting our parameter by max_parallel_workers is
more about grouping logically related parameters than a practical necessity.
> > Even if there is a bug in the code and a/v leader cannot release parallel
> > workers due to occured error, one day it will finish vacuuming and call
> > "proc_exit". During "proc_exit" the "before_shmem_exit_hook" along with
> > the "ReleaseAllParallelWorkers" will be called.
>
> What bugs are you concerned about in this case? I'm not sure what you
> meant by "a/v leader cannot release parallel workers due to occured
> error". It sounds like you mentioned a case where there is a bug in
> AutoVacuumReleaseParallelWorkers() but if there is the bug and the
> leader failed to release parallel workers, we would end up not writing
> these elogs in either case.
>
Not precisely. I mean a bug that causes a/v leader to not call
AutoVacuumReleaseParallelWorkers in the try/catch block.
I'll continue my thoughts below.
> > I suppose to do the same as we did for try/catch block - add logging inside
> > the "autovacuum_worker_before_shmem_exit" with some unique message.
> > Thus, we will be sure that the workers are released precisely in the
> > "before_shmem_exit_hook".
> >
> > The alternative is to pass some additional information to the
> > "ReleaseAllParallelWorkers" function (to supplement the log it emits), but it
> > doesn't seem like a good solution to me.
>
> I'm not sure if it's important to check how
> AutoVacuumReleaseAllParallelWorkers() has been called (either in
> PG_CATCH() block or by autovacuum_worker_before_shmem_exit()). We
> would end up having to add a unique message to each caller of
> AutoVacuumReleaseAllParallelWorkers() in the future. I guess it's more
> important to make sure that all workers have been released in the end.
>
> In that sense, it would make more sense to check that all workers have
> actually been released (i.e., checking by
> get_parallel_autovacuum_free_workers()) after a parallel vacuum
> instead of checking workers being released by debug logs. That is, we
> can check at each test end if get_parallel_autovacuum_free_workers()
> returns the expected number after disabling parallel autovacuum.
>
Sure, first of all we want to check whether all workers have been
released. But the ability to release them precisely in the try/catch
block is also important, because otherwise the a/v worker can "hold"
these workers until it finishes vacuuming other tables (which can
take a lot of time). Such a situation will surely degrade performance,
so I think that we must check whether we can release workers precisely
during ERROR handling. Do you agree?
I understand your concerns about adding a unique log message for each
ReleaseAll call. But I cannot imagine a new situation in which we would
need to release workers in an emergency. If you think that it might be
possible, I can propose adding a new optional parameter to the "ReleaseAll"
function - something like "char *context_msg", which will be added to the
elog placed inside this function.
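To make the "context_msg" idea concrete, here is a minimal standalone sketch.
It is not patch code: release_all_parallel_workers(), nworkers_reserved and
the message text are hypothetical stand-ins, and the log line is formatted
into a caller-supplied buffer instead of going through elog().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-process count of reserved workers. */
int nworkers_reserved = 0;

/*
 * Release every reserved worker and format a log line into 'buf'.
 * 'context_msg' is optional: callers with nothing to add pass NULL.
 * Returns the number of workers that were released.
 */
int
release_all_parallel_workers(const char *context_msg, char *buf, size_t buflen)
{
    int released = nworkers_reserved;

    assert(released >= 0);
    nworkers_reserved = 0;

    if (context_msg != NULL)
        snprintf(buf, buflen, "released %d parallel worker(s) (%s)",
                 released, context_msg);
    else
        snprintf(buf, buflen, "released %d parallel worker(s)", released);

    return released;
}
```

With something like this, the PG_CATCH() caller and the before_shmem_exit
callback would each pass their own short tag, and a future caller could
simply pass NULL.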
> On second thoughts on the "planned" and "reserved", can we consider
> what the patch implemented as "reserved" as the "planned" in
> autovacuum cases? That is, in autovacuum cases, the "planned" number
> considers the number of parallel degrees based on the number of
> indexes (or autovacuum_parallel_workers value) as well as the number
> of workers that have actually been reserved. In cases of
> autovacuum_max_parallel_workers shortage, users would notice by seeing
> logs that enough workers are not planned in the first place against
> the number of indexes on the table. That might be less confusing for
> users rather than introducing a new "reserved" concept in the vacuum
> logs. Also, it slightly helps simplify the codes.
Yeah, it sounds tempting. But in this case we're shifting more responsibility
to the user. For instance:
If av_max_workers = 5 and there are two a/v leaders, each of which is trying
to launch 3 parallel workers, we will see logs like "3 planned, 3 launched",
"2 planned, 2 launched". IMHO, such a log doesn't imply that there is a
shortage of workers. I.e., it is the user's responsibility to notice that the
second a/v leader could have launched more than 2 workers for processing the
table with (N + 2) indexes.
In this case even our previous version of logging gives more information
to the user: "3 planned, 3 launched", "3 planned, 2 launched".
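The arithmetic of this example can be modelled in a few lines. This is only
an illustrative sketch: WorkerPool and reserve_workers() are hypothetical
names, and the real reservation code also takes a lock, which is omitted here.

```c
#include <assert.h>

/* Hypothetical model of the shared pool of parallel autovacuum workers. */
typedef struct WorkerPool
{
    int max_workers;    /* models the configured pool size */
    int free_workers;   /* workers not reserved by any leader */
} WorkerPool;

/*
 * Reserve up to 'planned' workers from the pool.
 * Returns the number actually reserved, which may be less than 'planned'.
 */
int
reserve_workers(WorkerPool *pool, int planned)
{
    int reserved;

    assert(planned >= 0 && pool->free_workers >= 0);

    reserved = (planned < pool->free_workers) ? planned : pool->free_workers;
    pool->free_workers -= reserved;

    return reserved;
}
```

Starting from a pool of 5, the first leader asking for 3 gets 3 and the
second gets only 2. If the log reports only the reserved number as "planned",
the original request of 3 is no longer visible anywhere.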
If we don't want to create a new "reserved" concept, maybe we can rename
it to something more intuitive? For example, "n_abandoned" - the number of
workers that we were unable to launch due to an av_max_parallel_workers
shortage. If n_abandoned is 0 and n_launched < n_planned, the user can
conclude that they should increase the max_parallel_workers parameter.
And vice versa, if n_launched == n_planned and n_abandoned > 0, the
user can conclude that they should increase the
autovacuum_max_parallel_workers parameter.
What do you think?
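As a sketch of how a log reader (or a monitoring script) could apply these
two rules, assuming the three counters are available as plain integers.
shortage_hints() and the HINT_* flags are hypothetical names. The combined
case, where both conditions hold at once, is my extrapolation and is not
spelled out above.

```c
#include <assert.h>

/* Hypothetical hint flags; they can be OR-ed together. */
enum
{
    HINT_RAISE_MAX_PARALLEL_WORKERS = 1 << 0,
    HINT_RAISE_AUTOVACUUM_MAX_PARALLEL_WORKERS = 1 << 1
};

/*
 * Map the proposed log counters to tuning hints.
 *
 * n_launched < n_planned: some planned workers could not be started, which
 * points at the general max_parallel_workers pool.
 * n_abandoned > 0: some workers were given up before planning, which points
 * at autovacuum_max_parallel_workers.
 */
int
shortage_hints(int n_planned, int n_launched, int n_abandoned)
{
    int hints = 0;

    assert(n_planned >= 0 && n_abandoned >= 0);
    assert(n_launched >= 0 && n_launched <= n_planned);

    if (n_launched < n_planned)
        hints |= HINT_RAISE_MAX_PARALLEL_WORKERS;
    if (n_abandoned > 0)
        hints |= HINT_RAISE_AUTOVACUUM_MAX_PARALLEL_WORKERS;

    return hints;
}
```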
**Comments on the 0001 patch**
> * of the worker list (see above).
> @@ -299,6 +308,8 @@ typedef struct
> WorkerInfo av_startingWorker;
> AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
> pg_atomic_uint32 av_nworkersForBalance;
> + uint32 av_freeParallelWorkers;
> + uint32 av_maxParallelWorkers;
> } AutoVacuumShmemStruct;
>
> We should use int32 instead of uint32.
I don't mind, but I don't quite understand the reason. We assume that the
minimal value for both variables is 0. Why shouldn't we use an unsigned
data type?
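For reference, one general C consideration (not necessarily the reviewer's
reason) is what happens when an intermediate result dips below zero. The
sketch below models the recomputation of free workers after the maximum is
lowered, which adjust_free_parallel_workers() in the 0004 patch clamps with
Max(nfree_workers, 0). Both function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Signed version: a negative intermediate value is visible and clampable. */
int32_t
free_after_reload_signed(int32_t new_max, int32_t in_use)
{
    int32_t nfree;

    assert(new_max >= 0 && in_use >= 0);

    nfree = new_max - in_use;       /* may be negative */
    return (nfree > 0) ? nfree : 0;
}

/*
 * Unsigned version: the subtraction wraps modulo 2^32 when in_use > new_max,
 * so a "clamp to zero" applied afterwards can never trigger.
 */
uint32_t
free_after_reload_unsigned(uint32_t new_max, uint32_t in_use)
{
    uint32_t nfree = new_max - in_use;

    return nfree;
}
```

With new_max = 1 and in_use = 2 (the situation exercised by Test 3 in the
0004 patch), the signed version yields 0 while the unsigned one yields
UINT32_MAX unless the comparison is done before subtracting.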
**Comments on the 0003 patch**
> I've attached the proposed changes to the 0003 patch, which includes:
>
> - removal of VacuumCostParams as it's not necessary.
> - comment updates.
> - other cosmetic updates.
Thank you! Most of the proposals are LGTM, but I'll edit a few comments.
**Comments on the 0004 patch**
> +#ifdef USE_INJECTION_POINTS
> + /*
> + * If we are parallel autovacuum worker, we can consume delay parameters
> + * during index processing (via vacuum_delay_point call). This logging
> + * allows tests to ensure this.
> + */
> + if (shared->is_autovacuum)
> + elog(DEBUG2,
> + "parallel autovacuum worker cost params: cost_limit=%d,
> cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d,
> cost_page_hit=%d",
> + vacuum_cost_limit,
> + vacuum_cost_delay,
> + VacuumCostPageMiss,
> + VacuumCostPageDirty,
> + VacuumCostPageHit);
> +#endif
>
> While it's true that we use these logs only during the regression
> tests that are enabled only when injection points are also enabled,
> these logs themselves are not related to the injection points. I'd
> recommend writing these logs when the worker refreshes its local delay
> parameters (i.e., in parallel_vacuum_update_shared_delay_params()).
>
I agree (thought about it too).
> +$node->append_conf('postgresql.conf', qq{
> + max_worker_processes = 20
> + max_parallel_workers = 20
> + max_parallel_maintenance_workers = 20
> + autovacuum_max_parallel_workers = 20
> + log_min_messages = debug2
> + log_autovacuum_min_duration = 0
> + autovacuum_naptime = '1s'
> + min_parallel_index_scan_size = 0
> + shared_preload_libraries=test_autovacuum
> +});
>
> It would be better to set log_autovacuum_min_duration = 0 to the
> specific table instead of setting globally.
>
I agree.
> + uint32 nfree_workers;
> +
> +#ifndef USE_INJECTION_POINTS
> + ereport(ERROR, errmsg("injection points not supported"));
> +#endif
> +
> + nfree_workers = AutoVacuumGetFreeParallelWorkers();
> +
> + PG_RETURN_UINT32(nfree_workers);
> +}
>
> As I commented above, I think we should use int32 for the number of
> parallel free workers. So let's change it here too.
No problem. But again, why do we avoid unsigned integers?
> +PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
> +Datum
> +get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
> +{
> + uint32 nfree_workers;
> +
> +#ifndef USE_INJECTION_POINTS
> + ereport(ERROR, errmsg("injection points not supported"));
> +#endif
> +
>
> I think we don't necessarily need to check the USE_INJECTION_POINTS in
> this function as we already have the check in the tap tests. The
> function itself is actually workable even without injection points.
>
I agree. It is a leftover from the previous test implementation.
> +# Copyright (c) 2024-2025, PostgreSQL Global Development Group
> +
>
> Please update the copyright year here too.
I keep forgetting about the meson file, sorry.
Thank you very much for the review!
Please see the updated set of patches.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v24-0004-Tests-for-parallel-autovacuum.patch (20.7K, 2-v24-0004-Tests-for-parallel-autovacuum.patch)
From df141f9e4588ca45e8430d3accf55f4cfe3d3a9f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v24 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 22 ++
src/backend/postmaster/autovacuum.c | 38 +++
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../t/001_parallel_autovacuum.pl | 299 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 +
.../modules/test_autovacuum/test_autovacuum.c | 31 ++
.../test_autovacuum/test_autovacuum.control | 3 +
13 files changed, 483 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2bcdbdcfcf3..4a3b826dde5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -872,6 +873,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 13304c40b59..82618ab3ac5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -47,6 +47,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -653,6 +654,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -919,6 +928,19 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * To be able to test whether all reserved parallel workers are released
+ * even after a failure, allow injection points to trigger a failure at
+ * this point.
+ *
+ * This injection point is also used to wait until parallel workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index cc3456e205d..1c51210883e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1409,7 +1409,18 @@ avl_sigusr2_handler(SIGNAL_ARGS)
static void
autovacuum_worker_before_shmem_exit(int code, Datum arg)
{
+ int nreserved_old = av_nworkers_reserved;
+
AutoVacuumReleaseAllParallelWorkers();
+
+ if (nreserved_old > 0)
+ {
+ elog(DEBUG2,
+ ngettext("autovacuum worker before_shmem_exit: %d parallel worker has been released",
+ "autovacuum worker before_shmem_exit: %d parallel workers has been released",
+ nreserved_old - av_nworkers_reserved),
+ nreserved_old - av_nworkers_reserved);
+ }
}
/*
@@ -2495,12 +2506,20 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ int nreserved_workers = av_nworkers_reserved;
+
/*
* Parallel autovacuum can reserve parallel workers. Make sure
* that all reserved workers are released.
*/
AutoVacuumReleaseAllParallelWorkers();
+ if (nreserved_workers > 0)
+ ereport(DEBUG2,
+ (errmsg("%d parallel autovacuum workers has been released after occured error",
+ nreserved_workers),
+ errhidecontext(true)));
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -3465,6 +3484,21 @@ AutoVacuumReleaseAllParallelWorkers(void)
Assert(av_nworkers_reserved == 0);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ */
+int32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ int32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_SHARED);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3633,5 +3667,9 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+ elog(DEBUG2,
+ "number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index f3783afb51b..d60010a43b4 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
+extern int32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4ac5c84db43..01fe0041c97 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e2b3eef4136..9dcdc68bc87 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..969af8bd52a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..7f8b5a7b4d3
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,299 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 20, 'All parallel workers have been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during index
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 3);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 1;
+ SELECT pg_reload_conf();
+});
+
+# Since 2 parallel workers have already been launched and will be released
+# later, we expect that:
+# 1) number of free workers will be '0' after config reload
+# 2) number of free workers will be '1' after releasing workers
+
+# Check statement (1)
+$log_start = $node->wait_for_log(
+ qr/number of free parallel autovacuum workers is set to 0 due to config reload/,
+ $log_start
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+# Check statement (2)
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 1, 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate a situation where the leader fails with an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$log_start = $node->wait_for_log(
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/,
+ $log_start
+);
+
+$log_start = $node->wait_for_log(
+ qr/2 parallel autovacuum workers has been released after occured error/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as the test above, but simulate the leader exiting due to a FATAL error.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the autovacuum leader has reserved parallel workers, then kill it
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker before_shmem_exit: 2 parallel workers has been released/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..e5646e0def5
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..dd5c839e851
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "postmaster/autovacuum.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ int32 nfree_workers;
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_INT32(nfree_workers);
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
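For readers following Test 3 above, the expected numbers (0 free workers right after the reload, 1 after the workers are released) can be reproduced with a very small model. This is only a sketch under assumptions: the struct and the `pool_*` helpers are hypothetical names, the starting limit of 10 is taken from the test's cleanup step rather than from the server defaults, and the patch itself may track the free count differently in autovacuum.c.

```c
#include <assert.h>

/* Hypothetical model of the shared pool; none of these names are from the patch. */
typedef struct ParallelWorkerPool
{
	int			max_workers;	/* models autovacuum_max_parallel_workers */
	int			nreserved;		/* workers currently reserved by leaders */
} ParallelWorkerPool;

/* Free workers never go negative, even if the limit drops below nreserved. */
static int
pool_free_workers(const ParallelWorkerPool *pool)
{
	int			nfree = pool->max_workers - pool->nreserved;

	return nfree > 0 ? nfree : 0;
}

/* Reserve up to nrequested workers; returns how many were actually granted. */
static int
pool_reserve(ParallelWorkerPool *pool, int nrequested)
{
	int			nfree = pool_free_workers(pool);
	int			ngranted = nrequested < nfree ? nrequested : nfree;

	assert(nrequested >= 0);
	pool->nreserved += ngranted;
	return ngranted;
}

/* Give previously reserved workers back to the pool. */
static void
pool_release(ParallelWorkerPool *pool, int nworkers)
{
	assert(nworkers >= 0 && nworkers <= pool->nreserved);
	pool->nreserved -= nworkers;
}

/* Models a config reload that changes the limit while workers are in use. */
static void
pool_set_max(ParallelWorkerPool *pool, int new_max)
{
	assert(new_max >= 0);
	pool->max_workers = new_max;
}
```

With a limit of 10 and 2 workers reserved, lowering the limit to 1 leaves 0 free workers; releasing the 2 workers afterwards leaves 1, which is what the test checks through get_parallel_autovacuum_free_workers().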
[text/x-patch] v24-0003-Cost-based-parameters-propagation-for-parallel-a.patch (10.4K, 3-v24-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From 1b99783b4be5909cd5d168f5e019a5d3e2a2118c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v24 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 21 +++-
src/backend/commands/vacuumparallel.c | 157 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 180 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index b9840637783..5fba48d0536 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2434,8 +2434,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2449,6 +2460,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 643849b2fb8..13304c40b59 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -54,6 +61,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -123,6 +155,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * autovacuum leader and its parallel autovacuum workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -225,6 +269,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -236,6 +285,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -396,6 +446,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -540,6 +605,95 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
&wusage->cleanup);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For a non-autovacuum parallel worker, this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if the parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1109,6 +1263,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 267fdcbe1a8..cc3456e205d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1690,7 +1690,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1b1fb625cb2..4bfeba8264d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -434,6 +434,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 536237ff546..1120646f2c8 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2070,6 +2070,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
--
2.43.0
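The generation-counter scheme described in the 0003 patch above can be illustrated outside PostgreSQL. The following is a minimal standalone sketch, not the patch's code: it assumes C11 atomics in place of pg_atomic_uint32, an atomic_flag spinlock in place of slock_t, and only two parameters; `SharedCostParams`, `LocalCostParams`, `leader_propagate`, `worker_attach` and `worker_refresh` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PVSharedCostParams; not the PostgreSQL struct. */
typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped by the leader after each update */
	atomic_flag lock;			/* tiny spinlock protecting the fields below */
	double		cost_delay;
	int			cost_limit;
} SharedCostParams;

/* Worker-local copy plus the last generation this worker has seen. */
typedef struct LocalCostParams
{
	uint32_t	seen_generation;
	double		cost_delay;
	int			cost_limit;
} LocalCostParams;

static void
params_lock(SharedCostParams *shared)
{
	while (atomic_flag_test_and_set_explicit(&shared->lock, memory_order_acquire))
		;						/* spin until the flag is released */
}

static void
params_unlock(SharedCostParams *shared)
{
	atomic_flag_clear_explicit(&shared->lock, memory_order_release);
}

static void
shared_params_init(SharedCostParams *shared, double delay, int limit)
{
	atomic_init(&shared->generation, 0);
	atomic_flag_clear(&shared->lock);
	shared->cost_delay = delay;
	shared->cost_limit = limit;
}

/* Leader side: publish new values and bump the generation if they changed. */
static bool
leader_propagate(SharedCostParams *shared, double delay, int limit)
{
	/* Only the leader writes these fields, so it may compare without the lock. */
	if (shared->cost_delay == delay && shared->cost_limit == limit)
		return false;

	params_lock(shared);
	shared->cost_delay = delay;
	shared->cost_limit = limit;
	params_unlock(shared);

	atomic_fetch_add(&shared->generation, 1);
	return true;
}

/* Worker side: take the initial values when attaching to the shared state. */
static void
worker_attach(SharedCostParams *shared, LocalCostParams *local)
{
	local->seen_generation = atomic_load(&shared->generation);
	params_lock(shared);
	local->cost_delay = shared->cost_delay;
	local->cost_limit = shared->cost_limit;
	params_unlock(shared);
}

/* Worker side: cheap generation check first, copy under the lock only on change. */
static bool
worker_refresh(SharedCostParams *shared, LocalCostParams *local)
{
	uint32_t	gen = atomic_load(&shared->generation);

	assert(local->seen_generation <= gen);
	if (gen == local->seen_generation)
		return false;

	params_lock(shared);
	local->cost_delay = shared->cost_delay;
	local->cost_limit = shared->cost_limit;
	params_unlock(shared);

	local->seen_generation = gen;
	return true;
}
```

The point of the design is that the common case (nothing changed) costs the worker a single atomic read per delay point; the lock is taken only when the leader has actually published new values.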
[text/x-patch] v24-0002-Logging-for-parallel-autovacuum.patch (10.2K, 4-v24-0002-Logging-for-parallel-autovacuum.patch)
From ff36d0daf6abb1d74370111a18762643e417aba8 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v24 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 54 ++++++++++++++++++++++++++-
src/backend/commands/vacuumparallel.c | 32 +++++++++++++---
src/include/commands/vacuum.h | 39 ++++++++++++++++++-
src/tools/pgindent/typedefs.list | 2 +
4 files changed, 117 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5b6f2441f6b..2bcdbdcfcf3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -342,6 +342,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -780,6 +787,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1122,6 +1134,42 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d reserved, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nreserved,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2668,7 +2716,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3102,7 +3151,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 806a7f48326..643849b2fb8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -228,7 +228,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -503,7 +503,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -514,7 +514,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -522,7 +523,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -534,7 +536,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -616,10 +619,13 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the statistics it stores will be updated according
+ * to what happens during function execution.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -656,13 +662,23 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ {
AutoVacuumReserveParallelWorkers(&nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nreserved += nworkers;
+ }
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -729,6 +745,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..1b1fb625cb2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,39 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ /* Number of parallel workers we planned to launch */
+ int nplanned;
+
+ /*
+ * Number of parallel workers we have managed to reserve.
+ *
+ * Note that this statistic is collected only for parallel *autovacuum*,
+ * which must reserve workers in shared state before actually trying to
+ * launch them (in order to respect the
+ * autovacuum_max_parallel_workers limit). Manual VACUUM (PARALLEL), by
+ * contrast, doesn't need to reserve workers.
+ */
+ int nreserved;
+
+ /* Number of launched parallel workers */
+ int nlaunched;
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +427,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 77e3c04144e..536237ff546 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2070,6 +2070,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
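The planned/reserved/launched accounting added by the 0002 patch above can be summarized in a standalone sketch. It mirrors the order of the updates in parallel_vacuum_process_all_indexes() as shown in the diff, but it is only an illustration: `WorkersStats` and `record_pass` are hypothetical names, and the reservation and launch results are passed in as plain integers instead of coming from AutoVacuumReserveParallelWorkers() and LaunchParallelWorkers().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PVWorkersStats. */
typedef struct WorkersStats
{
	int			nplanned;
	int			nreserved;
	int			nlaunched;
} WorkersStats;

/*
 * Account for one index vacuum or cleanup pass.
 *
 * nplanned  - workers the leader wants for this pass
 * ngranted  - workers the reservation step would grant (autovacuum only)
 * nlaunched - workers that actually started
 *
 * Returns the number of reserved workers that were never launched and so
 * should be handed back to the pool (always 0 for manual VACUUM).
 */
static int
record_pass(WorkersStats *stats, bool is_autovacuum,
			int nplanned, int ngranted, int nlaunched)
{
	int			nworkers = nplanned;

	if (nworkers > 0)
		stats->nplanned += nworkers;

	if (is_autovacuum && nworkers > 0)
	{
		/* The reservation may grant fewer workers than requested. */
		if (ngranted < nworkers)
			nworkers = ngranted;
		stats->nreserved += nworkers;
	}

	assert(nlaunched >= 0 && nlaunched <= nworkers);
	if (nlaunched > 0)
		stats->nlaunched += nlaunched;

	return is_autovacuum ? nworkers - nlaunched : 0;
}
```

Because the counters accumulate across passes, a table that needs several index scans reports totals, which is consistent with the "in total" wording of the index vacuum log line.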
[text/x-patch] v24-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 5-v24-0005-Documentation-for-parallel-autovacuum.patch)
From 84d78c58932bb1d9f1bf01319a583e68278e7bca Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v24 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f670e2d4c31..07139ec7ff2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9380,6 +9381,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can be
+ used for parallel index vacuuming at one time. This value is capped by
+ <xref linkend="guc-max-parallel-workers"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..4894de021cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based
+ on the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v24-0001-Parallel-autovacuum.patch (19.4K, 6-v24-0001-Parallel-autovacuum.patch)
From 3222e8734acb39452f9f2e8c96960cfac99dff5d Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v24 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 8 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..806a7f48326 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
@@ -374,8 +377,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -554,12 +558,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -598,8 +607,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -647,6 +656,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -691,6 +707,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -739,6 +765,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6fde740465f..267fdcbe1a8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +292,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ int32 av_freeParallelWorkers;
+ int32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -361,6 +372,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -759,6 +771,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -775,6 +789,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1379,6 +1402,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2275,6 +2308,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2456,6 +2495,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2856,8 +2901,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3334,6 +3383,88 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter: on input, the number of parallel workers
+ * the caller requests; on output, the number of workers actually reserved.
+ *
+ * The caller must call AutoVacuumRelease[All]ParallelWorkers() to release the
+ * reserved workers.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+}
+
+/*
+ * Releases the reserved parallel workers for autovacuum.
+ *
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Cannot release more workers than reserved */
+ Assert(nworkers <= av_nworkers_reserved);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at the new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+}
+
+/*
+ * Same as above, but this function releases all the parallel workers that
+ * this autovacuum worker reserved.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3394,6 +3525,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_parallel_workers);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3475,3 +3610,28 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjusts the number of free parallel workers to correspond to the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ int nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..4a5c73a9e33 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 9507778415d..92b69c65e83 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Sets the maximum number of parallel autovacuum workers that can be taken from the background worker pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..ef8126f3790 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 905c076763c..31ec2f51753 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..11dd3aebc6c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,14 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Target number of parallel autovacuum workers. -1 by default disables
+ * parallel vacuum during autovacuum. 0 means choose the parallel degree
+ * based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
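
The reserve/release accounting in the patch above can be modelled outside PostgreSQL. The following is a minimal sketch, not the patch itself: WorkerPool and the pool_* functions are hypothetical names, the two counters stand in for av_freeParallelWorkers and av_maxParallelWorkers, and locking is left out (the patch takes AutovacuumLock around each update). It also assumes the stored maximum equals the previous GUC value, whereas the patch passes the previous value in explicitly.

```c
#include <assert.h>

/* Hypothetical stand-in for the two counters added to AutoVacuumShmemStruct. */
typedef struct WorkerPool
{
    int free_workers;   /* models av_freeParallelWorkers */
    int max_workers;    /* models av_maxParallelWorkers */
} WorkerPool;

static inline int min_int(int a, int b) { return a < b ? a : b; }
static inline int max_int(int a, int b) { return a > b ? a : b; }

/* At startup every worker is free. */
void
pool_init(WorkerPool *pool, int max_workers)
{
    pool->max_workers = max_workers;
    pool->free_workers = max_workers;
}

/* Grant as many workers as are free; the caller may get fewer than requested. */
int
pool_reserve(WorkerPool *pool, int requested)
{
    int granted = min_int(pool->free_workers, requested);

    pool->free_workers -= granted;
    return granted;
}

/* Return workers, capping at the (possibly reduced) maximum. */
void
pool_release(WorkerPool *pool, int nworkers)
{
    pool->free_workers = min_int(pool->free_workers + nworkers,
                                 pool->max_workers);
}

/* Apply a new maximum: shift the free count by the delta, never below zero. */
void
pool_set_max(WorkerPool *pool, int new_max)
{
    int nfree = new_max - pool->max_workers + pool->free_workers;

    pool->free_workers = max_int(nfree, 0);
    pool->max_workers = new_max;
}
```

The cap in pool_release() matters when the maximum is lowered while workers are still reserved: the later releases must not push the free count above the new maximum.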
[text/x-patch] v23--v24-diff-for-0004.patch (5.4K, 7-v23--v24-diff-for-0004.patch)
From d6add90f5146fe0acae78fbcf72d9559b21c9305 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 4 Mar 2026 13:39:03 +0700
Subject: [PATCH] fixes for 0004
---
src/backend/commands/vacuumparallel.c | 24 +++++++------------
src/backend/postmaster/autovacuum.c | 4 ++--
src/include/postmaster/autovacuum.h | 2 +-
src/test/modules/test_autovacuum/meson.build | 2 +-
.../t/001_parallel_autovacuum.pl | 4 ++--
.../modules/test_autovacuum/test_autovacuum.c | 8 ++-----
6 files changed, 16 insertions(+), 28 deletions(-)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 828844ffc67..414a465d99f 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -654,6 +654,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -1311,22 +1319,6 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
-#ifdef USE_INJECTION_POINTS
- /*
- * If we are parallel autovacuum worker, we can consume delay parameters
- * during index processing (via vacuum_delay_point call). This logging
- * allows tests to ensure this.
- */
- if (shared->is_autovacuum)
- elog(DEBUG2,
- "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
- vacuum_cost_limit,
- vacuum_cost_delay,
- VacuumCostPageMiss,
- VacuumCostPageDirty,
- VacuumCostPageHit);
-#endif
-
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ee8d9ba0428..1c51210883e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3487,10 +3487,10 @@ AutoVacuumReleaseAllParallelWorkers(void)
/*
* Get number of free autovacuum parallel workers.
*/
-uint32
+int32
AutoVacuumGetFreeParallelWorkers(void)
{
- uint32 nfree_workers;
+ int32 nfree_workers;
LWLockAcquire(AutovacuumLock, LW_SHARED);
nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 52be260e15f..d60010a43b4 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,7 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
-extern uint32 AutoVacuumGetFreeParallelWorkers(void);
+extern int32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
index 75b24814b13..969af8bd52a 100644
--- a/src/test/modules/test_autovacuum/meson.build
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -1,4 +1,4 @@
-# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
test_autovacuum_sources = files(
'test_autovacuum.c',
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index edfbde73aac..7f8b5a7b4d3 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -40,7 +40,6 @@ $node->append_conf('postgresql.conf', qq{
max_parallel_maintenance_workers = 20
autovacuum_max_parallel_workers = 20
log_min_messages = debug2
- log_autovacuum_min_duration = 0
autovacuum_naptime = '1s'
min_parallel_index_scan_size = 0
shared_preload_libraries=test_autovacuum
@@ -70,7 +69,8 @@ $node->safe_psql('postgres', qq{
CREATE TABLE test_autovac (
id SERIAL PRIMARY KEY,
col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
- ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
INSERT INTO test_autovac
SELECT
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
index 195a6149a5d..dd5c839e851 100644
--- a/src/test/modules/test_autovacuum/test_autovacuum.c
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -23,13 +23,9 @@ PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
Datum
get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
{
- uint32 nfree_workers;
-
-#ifndef USE_INJECTION_POINTS
- ereport(ERROR, errmsg("injection points not supported"));
-#endif
+ int32 nfree_workers;
nfree_workers = AutoVacuumGetFreeParallelWorkers();
- PG_RETURN_UINT32(nfree_workers);
+ PG_RETURN_INT32(nfree_workers);
}
--
2.43.0
[text/x-patch] v23--v24-diff-for-0003.patch (10.4K, 8-v23--v24-diff-for-0003.patch)
From 848d628a56d78b38b21b5a83d1f63e03075171af Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 4 Mar 2026 02:50:39 +0700
Subject: [PATCH] fixes for 0003
---
src/backend/commands/vacuum.c | 10 +-
src/backend/commands/vacuumparallel.c | 132 ++++++++++++--------------
src/tools/pgindent/typedefs.list | 1 -
3 files changed, 67 insertions(+), 76 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e94e35481a2..5fba48d0536 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2440,10 +2440,8 @@ vacuum_delay_point(bool is_analyze)
if (IsParallelWorker())
{
/*
- * Possibly update cost-based delay parameters.
- *
- * Do it before checking VacuumCostActive, because its value might be
- * changed after calling this function.
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
*/
parallel_vacuum_update_shared_delay_params();
}
@@ -2464,8 +2462,8 @@ vacuum_delay_point(bool is_analyze)
VacuumUpdateCosts();
/*
- * If we are parallel autovacuum leader and some of cost-based
- * parameters had changed, let other parallel workers know.
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
*/
parallel_vacuum_propagate_shared_delay_params();
}
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 80b57bf9da3..13304c40b59 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -54,26 +61,6 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
-/*
- * Helper for the PVSharedCostParams structure (see below), to avoid
- * repetition.
- */
-typedef struct VacuumCostParams
-{
- double cost_delay;
- int cost_limit;
- int cost_page_dirty;
- int cost_page_hit;
- int cost_page_miss;
-} VacuumCostParams;
-
-#define FillVacCostParams(cost_params) \
- (cost_params)->cost_delay = vacuum_cost_delay; \
- (cost_params)->cost_limit = vacuum_cost_limit; \
- (cost_params)->cost_page_dirty = VacuumCostPageDirty; \
- (cost_params)->cost_page_hit = VacuumCostPageHit; \
- (cost_params)->cost_page_miss = VacuumCostPageMiss
-
/*
* Struct for cost-based vacuum delay related parameters to share among an
* autovacuum worker and its parallel vacuum workers.
@@ -81,23 +68,22 @@ typedef struct VacuumCostParams
typedef struct PVSharedCostParams
{
/*
- * Each time leader worker updates its parameters, it must increase
- * generation. Every parallel worker keeps the generation
- * (shared_params_local_generation) at which it had last time received
- * parameters from the leader.
- *
- * It is enough for worker to compare it's local_generation with the field
- * below to determine whether it needs to receive new parameters' values.
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
*/
pg_atomic_uint32 generation;
slock_t mutex; /* protects all fields below */
- /*
- * Copies of the corresponding cost-based vacuum delay parameters from
- * autovacuum leader process.
- */
- VacuumCostParams params_data;
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
} PVSharedCostParams;
/*
@@ -285,7 +271,7 @@ struct ParallelVacuumState
static PVSharedCostParams *pv_shared_cost_params = NULL;
-/* See comments for the PVSharedCostParams structure for the explanation. */
+/* See comments in the PVSharedCostParams for the details */
static uint32 shared_params_generation_local = 0;
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
@@ -299,6 +285,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -461,9 +448,13 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
if (shared->is_autovacuum)
{
- FillVacCostParams(&shared->cost_params.params_data);
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
pg_atomic_init_u32(&shared->cost_params.generation, 0);
SpinLockInit(&shared->cost_params.mutex);
@@ -615,10 +606,21 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
}
/*
- * If we are parallel *autovacuum* worker, check whether related to cost-based
- * vacuum delay parameters had changed in the leader worker. If so,
- * corresponding parameters will be updated to the values which leader worker
- * is operating on.
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
*
* For non-autovacuum parallel worker this function will have no effect.
*/
@@ -629,7 +631,7 @@ parallel_vacuum_update_shared_delay_params(void)
Assert(IsParallelWorker());
- /* Check whether we are running parallel autovacuum */
+ /* Quick return if the worker is not running for autovacuum */
if (pv_shared_cost_params == NULL)
return;
@@ -641,13 +643,11 @@ parallel_vacuum_update_shared_delay_params(void)
return;
SpinLockAcquire(&pv_shared_cost_params->mutex);
-
- VacuumCostDelay = pv_shared_cost_params->params_data.cost_delay;
- VacuumCostLimit = pv_shared_cost_params->params_data.cost_limit;
- VacuumCostPageDirty = pv_shared_cost_params->params_data.cost_page_dirty;
- VacuumCostPageHit = pv_shared_cost_params->params_data.cost_page_hit;
- VacuumCostPageMiss = pv_shared_cost_params->params_data.cost_page_miss;
-
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
SpinLockRelease(&pv_shared_cost_params->mutex);
VacuumUpdateCosts();
@@ -656,46 +656,40 @@ parallel_vacuum_update_shared_delay_params(void)
}
/*
- * Function to be called from parallel autovacuum leader in order to propagate
- * some cost-based vacuum delay parameters to the supportive workers.
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
*/
void
parallel_vacuum_propagate_shared_delay_params(void)
{
- VacuumCostParams *params_data;
-
Assert(AmAutoVacuumWorkerProcess());
- /* Check whether we are running parallel autovacuum */
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
if (pv_shared_cost_params == NULL)
return;
/*
- * Only leader worker can modify this shared structure, so we can read it
- * without acquiring a lock.
+ * Check if any delay parameter has changed. We can read them without
+ * locks as only the leader can modify them.
*/
- params_data = &pv_shared_cost_params->params_data;
-
- if (vacuum_cost_delay == params_data->cost_delay &&
- vacuum_cost_limit == params_data->cost_limit &&
- VacuumCostPageDirty == params_data->cost_page_dirty &&
- VacuumCostPageHit == params_data->cost_page_hit &&
- VacuumCostPageMiss == params_data->cost_page_miss)
- {
- /*
- * We don't need to update shared cost-based vacuum delay params if
- * they haven't changed.
- */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
return;
- }
+ /* Update the shared delay parameters */
SpinLockAcquire(&pv_shared_cost_params->mutex);
- FillVacCostParams(&pv_shared_cost_params->params_data);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
SpinLockRelease(&pv_shared_cost_params->mutex);
/*
- * Increase generation of the parameters, i.e. let parallel workers know
- * that they should re-read shared cost params.
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
*/
pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index de9f576e0f3..1120646f2c8 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3250,7 +3250,6 @@ VacAttrStatsP
VacDeadItemsInfo
VacErrPhase
VacOptValue
-VacuumCostParams
VacuumParams
VacuumRelation
VacuumStmt
--
2.43.0
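
The generation-counter scheme described in the file header comment of this patch can be sketched in isolation. This is a single-threaded illustration under stated assumptions: SharedCostParams, LocalCostParams and the functions below are hypothetical names, only two of the five parameters are carried, C11 atomics replace pg_atomic_uint32, and the spinlock that protects the shared fields in the patch is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, reduced model of PVSharedCostParams. */
typedef struct SharedCostParams
{
    atomic_uint generation;  /* bumped by the leader on every change */
    double      cost_delay;  /* the patch guards these with a spinlock */
    int         cost_limit;
} SharedCostParams;

/* What each worker keeps privately. */
typedef struct LocalCostParams
{
    unsigned generation_seen; /* models shared_params_generation_local */
    double   cost_delay;
    int      cost_limit;
} LocalCostParams;

void
shared_init(SharedCostParams *shared, double delay, int limit)
{
    atomic_init(&shared->generation, 0);
    shared->cost_delay = delay;
    shared->cost_limit = limit;
}

/* Worker start-up: take the current values and remember the generation. */
void
worker_attach(SharedCostParams *shared, LocalCostParams *local)
{
    local->generation_seen = atomic_load(&shared->generation);
    local->cost_delay = shared->cost_delay;
    local->cost_limit = shared->cost_limit;
}

/* Leader side: publish only if something changed. Returns true if published. */
bool
leader_propagate(SharedCostParams *shared, double delay, int limit)
{
    if (shared->cost_delay == delay && shared->cost_limit == limit)
        return false;

    shared->cost_delay = delay;
    shared->cost_limit = limit;
    atomic_fetch_add(&shared->generation, 1);
    return true;
}

/* Worker side: refresh local copies only when the generation has moved. */
bool
worker_refresh(SharedCostParams *shared, LocalCostParams *local)
{
    unsigned current = atomic_load(&shared->generation);

    if (current == local->generation_seen)
        return false;

    local->cost_delay = shared->cost_delay;
    local->cost_limit = shared->cost_limit;
    local->generation_seen = current;
    return true;
}
```

The point of the counter is that the common case, nothing changed, costs the worker one atomic read and no lock acquisition.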
[text/x-patch] v23--v24-diff-for-0001.patch (877B, 9-v23--v24-diff-for-0001.patch)
From e2c7a74a110941ff86e7aabb85aa23fccbcfde5b Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 4 Mar 2026 02:19:03 +0700
Subject: [PATCH] fixes for 0001
---
src/backend/postmaster/autovacuum.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f40abe90ed5..267fdcbe1a8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -308,8 +308,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
- uint32 av_freeParallelWorkers;
- uint32 av_maxParallelWorkers;
+ int32 av_freeParallelWorkers;
+ int32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-10 18:13 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-10 18:13 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Tue, Mar 3, 2026 at 10:59 PM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Tue, Mar 3, 2026 at 5:26 AM Masahiko Sawada <[email protected]> wrote:
> >
> > On Sun, Mar 1, 2026 at 6:46 AM Daniil Davydov <[email protected]> wrote:
> > >
> > > Thus, a/v leader cannot launch any workers if max_parallel_workers is set to 0.
> >
> > Right. But this fact would actually support that limiting
> > autovacuum_max_parallel_workers by max_parallel_workers is more
> > appropriate, no?
> >
>
> av_max_parallel_workers is really limited by max_parallel_workers only
> during shmem init. After that we can change it to a value that is higher
> than max_parallel_workers, and nothing bad will happen (obviously).
>
> So, my point was : why should we have this explicit limitation if it
> 1) doesn't guard us from something bad and 2) can be violated at any time
> (via ALTER SYSTEM SET ...).
>
> Now it seems to me that limiting our parameter by max_parallel_workers is
> more about grouping of logically related parameters, not a practical necessity.
I believe there is also a benefit for users when they want to disable
all parallel behavior. If av_max_parallel_workers is in the
max_parallel_workers group, they only have to set
max_parallel_workers to 0. Otherwise, they would have to set both
max_parallel_workers and av_max_parallel_workers.
>
> > > I suppose to do the same as we did for try/catch block - add logging inside
> > > the "autovacuum_worker_before_shmem_exit" with some unique message.
> > > Thus, we will be sure that the workers are released precisely in the
> > > "before_shmem_exit_hook".
> > >
> > > The alternative is to pass some additional information to the
> > > "ReleaseAllParallelWorkers" function (to supplement the log it emits), but it
> > > doesn't seem like a good solution to me.
> >
> > I'm not sure if it's important to check how
> > AutoVacuumReleaseAllParallelWorkers() has been called (either in
> > PG_CATCH() block or by autovacuum_worker_before_shmem_exit()). We
> > would end up having to add a unique message to each caller of
> > AutoVacuumReleaseAllParallelWorkers() in the future. I guess it's more
> > important to make sure that all workers have been released in the end.
> >
> > In that sense, it would make more sense to check that all workers have
> > actually been released (i.e., checking by
> > get_parallel_autovacuum_free_workers()) after a parallel vacuum
> > instead of checking workers being released by debug logs. That is, we
> > can check at each test end if get_parallel_autovacuum_free_workers()
> > returns the expected number after disabling parallel autovacuum.
> >
>
> Sure, at first we want to check whether all workers have been
> released. But the ability to release them precisely in the try/catch
> block is also important, because if it doesn't - a/v worker can "hold"
> these workers until it finishes vacuuming of other tables (which can
> take a lot of time). Such a situation will surely degrade performance,
> so I think that we must check whether we can release workers precisely
> during ERROR handling. Do you agree with it?
I agree that we need to make sure that parallel workers are released
even during ERROR handling, but I don't think it's important to use
regression tests to check the places where
AutoVacuumReleaseAllParallelWorkers() is called. It's more important
and future-proof to check that all workers are released according to
the shmem data. In other words, even if we call
AutoVacuumReleaseAllParallelWorkers() in an unexpected call path in an
ERROR case, it's still okay as long as we successfully release all
workers in the end. These regression tests should test the database
behavior, not which specific code path is taken. If we can verify that
all workers are released by checking the shmem, why do we need to
check further where they are released?
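To illustrate the idea, here is a toy model, not PostgreSQL code. The struct and the toy_* functions are hypothetical stand-ins for the shmem counters and the reserve/release functions. The check looks only at the final free-worker count, so it passes regardless of which call path released the workers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counters kept in AutoVacuumShmem. */
typedef struct ToyShmem
{
	int		max_workers;	/* models av_maxParallelWorkers */
	int		free_workers;	/* models av_freeParallelWorkers */
} ToyShmem;

/* Reserve up to 'wanted' workers; returns how many were reserved. */
static int
toy_reserve(ToyShmem *shm, int wanted)
{
	int		n = wanted < shm->free_workers ? wanted : shm->free_workers;

	assert(wanted >= 0);
	shm->free_workers -= n;
	return n;
}

/* Release everything this leader still holds, from any call path. */
static void
toy_release_all(ToyShmem *shm, int *reserved)
{
	shm->free_workers += *reserved;
	*reserved = 0;
}

/* The behavioral check: are all workers back in the pool? */
static bool
toy_all_released(const ToyShmem *shm)
{
	return shm->free_workers == shm->max_workers;
}
```

A test built this way would reserve workers, force an error, and then only assert that toy_all_released() holds, without caring whether the release happened in the error handler or in an exit hook.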
>
> I understand your concerns about adding a unique log message for each
> ReleaseAll call. But I cannot imagine a new situation when we need to
> emergency release workers. If you think that it might be possible, I can
> propose adding a new optional parameter to the "ReleaseAll" function -
> something like "char *context_msg", which will be added to the elog placed
> inside this function.
I think we should not make the function complex just for testing
purposes. My point is that what we should be testing is the behavior
-- specifically whether parallel workers are released at the expected
timing -- rather than focusing on whether a specific code path was
executed.
>
> > On second thoughts on the "planned" and "reserved", can we consider
> > what the patch implemented as "reserved" as the "planned" in
> > autovacuum cases? That is, in autovacuum cases, the "planned" number
> > considers the number of parallel degrees based on the number of
> > indexes (or autovacuum_parallel_workers value) as well as the number
> > of workers that have actually been reserved. In cases of
> > autovacuum_max_parallel_workers shortage, users would notice by seeing
> > logs that enough workers are not planned in the first place against
> > the number of indexes on the table. That might be less confusing for
> > users rather than introducing a new "reserved" concept in the vacuum
> > logs. Also, it slightly helps simplify the codes.
>
> Yeah, it sounds tempting. But in this case we're shifting more responsibility
> to the user. For instance :
> If av_max_workers = 5 and there are two a/v leaders each of which is trying
> to launch 3 parallel workers, we will see logs like "3 planned, 3 launched",
> "2 planned, 2 launched". IMHO, such a log doesn't imply that there is a
> shortage of workers. I.e. this is the user's responsibility to notice that the
> second a/v leader could launch more than 2 workers for processing of the
> table with (N + 2) indexes.
> In this case even our previous version of logging will give more information
> to the user : "3 planned, 3 launched", "3 planned, 2 launched".
>
> If we don't want to create a new "reserved" concept, maybe we can rename
> it to something more intuitive? For example, "n_abandoned" - number of
> workers that we were unable to launch due to av_max_parallel_workers
> shortage. If n_abandoned is 0 and n_launched < n_planned, the user can
> conclude that he should increase the max_parallel_workers parameter.
> And vica versa, if n_launched == n_planned and n_abandoned > 0, the
> user can conclude that he should increase the
> autovacuum_max_parallel_workers parameter.
>
> What do you think?
While I agree that showing only two numbers might lack some
information for users, I guess the same is true for
max_parallel_maintenance_workers and other GUC parameters related to
parallel query. For instance, suppose we set
max_parallel_maintenance_workers to 2. If the table has 4 (large
enough) indexes, we would plan to execute a parallel vacuum with 2
workers instead of 4 due to the max_parallel_maintenance_workers
shortage, and it's even possible that only 1 worker can launch due to
a max_worker_processes shortage. In this case, we currently consider
that 2 workers are planned. Isn't it the same situation as the case
where we reserved 2 parallel vacuum workers for autovacuum for the
table with 4 indexes?
>
> **Comments on the 0001 patch**
>
> > * of the worker list (see above).
> > @@ -299,6 +308,8 @@ typedef struct
> > WorkerInfo av_startingWorker;
> > AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
> > pg_atomic_uint32 av_nworkersForBalance;
> > + uint32 av_freeParallelWorkers;
> > + uint32 av_maxParallelWorkers;
> > } AutoVacuumShmemStruct;
> >
> > We should use int32 instead of uint32.
>
> I don't mind, but I don't quite understand the reason. We assume that the
> minimal value for both variables is 0. Why shouldn't we use unsigned
> data type?
Unsigned integers should be used for bit masks, flags, or when we need
to handle more than INT_MAX. Signed integers are preferable in other
cases as we're using signed integers for controlling the number of
workers and autovacuum_max_parallel_workers is defined as signed int
(which could be stored to AutoVacuumShmem->av_maxParallelWorkers).
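One practical consequence, offered as a standalone illustration rather than as the reason given above (the function names are made up and this is not code from the patch): if a free-worker count is recomputed after the limit is lowered, the intermediate result can be negative. With a signed type it can be clamped to zero, whereas the unsigned subtraction silently wraps around.

```c
#include <assert.h>
#include <stdint.h>

#define Max(x, y) ((x) > (y) ? (x) : (y))

/* Signed arithmetic: a negative intermediate is visible and can be clamped. */
static int32_t
free_after_resize_signed(int32_t new_max, int32_t in_use)
{
	assert(new_max >= 0 && in_use >= 0);
	return Max(new_max - in_use, 0);
}

/* Unsigned arithmetic: new_max - in_use wraps instead of going negative. */
static uint32_t
free_after_resize_unsigned(uint32_t new_max, uint32_t in_use)
{
	return new_max - in_use;
}
```

With new_max = 2 and in_use = 5, the signed version yields 0, while the unsigned version yields a value close to UINT32_MAX.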
Here are some review comments.
* 0001 patch:
+ /* Cannot release more workers than reserved */
+ Assert(nworkers <= av_nworkers_reserved);
I think it's better to use Min() to cap the number of workers to be
released by av_nworkers_reserved as Assert() won't work in release
builds.
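A minimal sketch of what I mean, assuming a per-process counter like av_nworkers_reserved. The toy_* names are hypothetical and the shared-memory update and locking are omitted:

```c
#include <assert.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))

/* Hypothetical per-process counter, modeled on av_nworkers_reserved. */
static int	toy_nworkers_reserved = 0;

/*
 * Release up to 'nworkers' workers and return how many were actually
 * released. The request is capped by the number reserved, so the counter
 * cannot go negative even in builds where assertions are compiled out.
 */
static int
toy_release_workers(int nworkers)
{
	nworkers = Min(nworkers, toy_nworkers_reserved);
	toy_nworkers_reserved -= nworkers;

	/* Debug-only sanity check; the cap above protects release builds. */
	assert(toy_nworkers_reserved >= 0);

	return nworkers;
}
```

With 2 workers reserved, a request to release 5 releases only 2 and leaves the counter at 0.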
* 0004 patch:
Can we write the same test cases while not relying on the 0002 patch
(i.e., worker usage logging)? We check the worker usage log at two
places in the regression tests. The idea is that we write the number
of workers planned, reserved, and launched at DEBUG log level and
check these logs in the regression tests. Patches 0001, 0003, and
0004 can be merged before push while we might want more discussion on
the 0002 patch.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-11 11:28 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-11 11:28 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Wed, Mar 11, 2026 at 1:14 AM Masahiko Sawada <[email protected]> wrote:
>
> On Tue, Mar 3, 2026 at 10:59 PM Daniil Davydov <[email protected]> wrote:
> >
> > So, my point was : why should we have this explicit limitation if it
> > 1) doesn't guard us from something bad and 2) can be violated at any time
> > (via ALTER SYSTEM SET ...).
> >
> > Now it seems to me that limiting our parameter by max_parallel_workers is
> > more about grouping of logically related parameters, not a practical necessity.
>
> I believe there is also a benefit for users when they want to disable
> all parallel behavior. If av_max_parallel_workers is in
> max_parallel_worker group, they would have to set just
> max_parallel_workers to 0. Otherwise, they would have to set both
> max_parallel_workers and av_max_parallel_workers.
>
OK, thank you for the explanation!
> I agree that we need to make sure that parallel workers are released
> even during ERROR handling, but I don't think it's important to check
> the places where AutoVacuumReleaesAllParallelWorkers() is called, by
> using regression tests. It's more important and future-proof that we
> check if all workers are released according to the shmem data. In
> other words, even if we call AutoVacuumReleaseAllParallelWorkers() in
> an unexpected call path in an ERROR case, it's still okay if we
> successfully release all workers in the end. These regression tests
> should test these database behavior but not what specific code path
> taken.
Indeed, I can't recall anywhere else in the tests where we check that a
specific code path was taken in this way.
> If we can check if all workers are released by checking the
> shmem, why do we need to check further where they are released?
My point of view was that this code path is so important that we need to
test it (important in terms of performance).
But of course even if for some reason we cannot release workers inside
the try/catch block, we can still be sure that they will be released somewhere
else, because we have tested it.
> I think we should not make the function complex just for testing
> purposes. My point is that what we should be testing is the behavior
> -- specifically whether parallel workers are released at the expected
> timing -- rather than focusing on whether a specific code path was
> executed.
You've convinced me :)
I'll add a log to the "ReleaseWorkers" function and tests will only
search for it.
> While I agree that showing only two numbers might lack some
> information for users, I guess the same is true for
> max_parallel_maintenance_workers or other parallel queries related to
> GUC parameters. For instance, suppose we set
> max_parallel_maintenance_workers to 2, if the table has (large enough)
> 4 indexes, we would plan to execute a parallel vacuum with 2 workers
> instead of 4 due to max_parallel_maintenance_worker shortage and it's
> even possible that only 1 worker can launch due to
> max_worker_processes shortage. In this case, we currently consider
> that 2 workers are planned. Isn't it the same situation as the case
> where we reserved 2 parallel vacuum workers for autovacuum for the
> table with 4 indexes?
I don't think that examples with other "max_parallel_" parameters are
appropriate, because those parameters limit the number of parallel
workers for a *single* operation, executor node, and so on. In contrast,
av_max_parallel_workers limits the total number of parallel workers across
all a/v leaders.
Regarding the situation that you described :
The number of planned workers is reduced inside
parallel_vacuum_compute_workers due to the max_parallel_maintenance_workers
limit. I.e. we cannot plan more workers than the config allows, and
that's completely OK. No one expects the number of "planned workers" to be
more than max_parallel_maintenance_workers.
IMO there is no need to make an effort to track the shortage of
max_parallel_maintenance_workers for VACUUM (PARALLEL), because this
parameter just plays the role of a limiter. We only consider the
shortage of max_parallel_workers, which can be determined by looking at
"planned vs. launched".
And here is the difference with parallel autovacuum :
av_max_parallel_workers is considered twice : in the
"parallel_vacuum_compute_workers" and "ReserveWorkers" functions.
So a low number of launched workers can be explained by a shortage of
either av_max_parallel_workers or max_parallel_workers. Since we want to
distinguish between these cases, we have added the "nreserved" concept.
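To make the distinction concrete, here is a small sketch of how the three numbers could be interpreted. The enum and function are hypothetical and not part of the patch. It assumes that "reserved < planned" points at the av_max_parallel_workers pool and that "launched < reserved" points at background worker slots (max_parallel_workers or a similar limit).

```c
#include <assert.h>

/* Hypothetical classification of why fewer workers ran than were planned. */
typedef enum ToyShortage
{
	TOY_NO_SHORTAGE,
	TOY_AV_POOL_SHORTAGE,		/* nreserved < nplanned */
	TOY_BGWORKER_SHORTAGE,		/* nlaunched < nreserved */
	TOY_BOTH_SHORTAGE
} ToyShortage;

static ToyShortage
toy_classify_shortage(int nplanned, int nreserved, int nlaunched)
{
	int		av_short;
	int		bg_short;

	/* Each stage can only keep or reduce the previous stage's count. */
	assert(nlaunched >= 0);
	assert(nlaunched <= nreserved && nreserved <= nplanned);

	av_short = (nreserved < nplanned);
	bg_short = (nlaunched < nreserved);

	if (av_short && bg_short)
		return TOY_BOTH_SHORTAGE;
	if (av_short)
		return TOY_AV_POOL_SHORTAGE;
	if (bg_short)
		return TOY_BGWORKER_SHORTAGE;
	return TOY_NO_SHORTAGE;
}
```

With only "planned" and "launched" available, the inputs (3, 2, 2) and (2, 2, 2) would both be logged as "2 planned, 2 launched" if "planned" were redefined to mean "reserved", which is the information loss discussed above.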
I see that a few modules can report something like "out of background worker
slots" when they cannot launch more workers due to a max_parallel_workers
shortage (but modules depending on the "parallel.c" logic don't do so).
This fact gave me another idea :
If we don't want to log "nreserved" or some other similar value, maybe
we should add logging after the "ReserveWorkers" function? I.e. if some
workers cannot be reserved, we can emit a log like "out of parallel
autovacuum workers. you should increase the av_max_parallel_workers
parameter". Having this log can help the user distinguish between
max_parallel_workers/av_max_parallel_workers shortage situations.
What do you think?
Summary :
1)
I think that we should not look at maintenance vacuum while
considering how to inform the user about parameters shortage for autovacuum,
because we have a more complicated situation in case of autovacuum.
2)
I suggest adding a separate log that will be emitted every time we are
unable to start workers due to a shortage of av_max_parallel_workers.
> > I don't mind, but I don't quite understand the reason. We assume that the
> > minimal value for both variables is 0. Why shouldn't we use unsigned
> > data type?
>
> Unsigned integers should be used for bit masks, flags, or when we need
> to handle more than INT_MAX. Signed integers are preferable in other
> cases as we're using signed integers for controlling the number of
> workers and autovacuum_max_parallel_workers is defined as signed int
> (which could be stored to AutoVacuumShmem->av_maxParallelWorkers).
I understood, thank you.
> * 0001 patch:
>
> + /* Cannot release more workers than reserved */
> + Assert(nworkers <= av_nworkers_reserved);
>
> I think it's better to use Min() to cap the number of workers to be
> released by av_nworkers_reserved as Assert() won't work in release
> builds.
I agree.
> * 0004 patch:
>
> Can we write the same test cases while not relying on the 0002 patch
> (i.e., worker usage logging)? We check the worker usage log at two
> places in the regression tests. The idea is that we write the number
> of workers planned, reserved, and launched in DEBUG log level and
> check these logs in the regression tests. The patch 0001, 0003, and
> 0004 can be merged before push while we might want more discussion on
> the 0002 patch.
Possibly we can introduce a new injection point, or a new log for it.
But I assume that the subject of discussion in patch 0002 is the
"nreserved" logic, and "nlaunched/nplanned" logic does not raise any
questions.
I suggest splitting the 0002 patch into two parts : 1) basic logic and
2) additional logic with nreserved or something else. The second part can be
discussed in isolation from the patch set. If we do this, we may not have to
change the tests. What do you think?
Thank you for the review!
Please, see the updated set of patches.
I haven't touched patch 0002 yet, because I'd like to hear your opinion on
my suggestions above first.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v25-0003-Cost-based-parameters-propagation-for-parallel-a.patch (10.4K, 2-v25-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From e3bf8c1dbc5458ef8ec13b543c63dc731a742297 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v25 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 21 +++-
src/backend/commands/vacuumparallel.c | 157 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 180 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 643849b2fb8..13304c40b59 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -54,6 +61,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -123,6 +155,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * autovacuum leader and its parallel vacuum workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -225,6 +269,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -236,6 +285,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -396,6 +446,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -540,6 +605,95 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
&wusage->cleanup);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For a non-autovacuum parallel worker, this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for the autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameter has changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1109,6 +1263,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e1c995dd2ea..a0c020fa1a7 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1691,7 +1691,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1b1fb625cb2..4bfeba8264d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -434,6 +434,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0fb40f3c07f..aeb6bb869f0 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2071,6 +2071,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
--
2.43.0
[text/x-patch] v25-0004-Tests-for-parallel-autovacuum.patch (19.9K, 3-v25-0004-Tests-for-parallel-autovacuum.patch)
From 9b46e0ed6b7361d1b7c333723b9c75f0884fbac2 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v25 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 22 ++
src/backend/postmaster/autovacuum.c | 25 ++
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../t/001_parallel_autovacuum.pl | 299 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 +
.../modules/test_autovacuum/test_autovacuum.c | 31 ++
.../test_autovacuum/test_autovacuum.control | 3 +
13 files changed, 470 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 28624d3ba25..d51db0cf608 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -873,6 +874,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 13304c40b59..82618ab3ac5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -47,6 +47,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -653,6 +654,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -919,6 +928,19 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * To be able to test that all reserved parallel workers are released even
+ * when an error occurs, allow injection points to trigger a failure at
+ * this point.
+ *
+ * This injection point is also used to wait until parallel workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index a0c020fa1a7..7e994e88853 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3448,6 +3448,12 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+
+ elog(DEBUG2,
+ ngettext("autovacuum worker: %d parallel worker has been released",
+ "autovacuum worker: %d parallel workers has been released",
+ nworkers),
+ nworkers);
}
/*
@@ -3466,6 +3472,21 @@ AutoVacuumReleaseAllParallelWorkers(void)
Assert(av_nworkers_reserved == 0);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ */
+int32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ int32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_SHARED);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3634,5 +3655,9 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+ elog(DEBUG2,
+ "number of free parallel autovacuum workers is set to %d due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index f3783afb51b..d60010a43b4 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
+extern int32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4ac5c84db43..01fe0041c97 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e2b3eef4136..9dcdc68bc87 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..969af8bd52a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..e5dacf59980
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,299 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my ($psql_out, $log_start);
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac (col_1, col_2, col_3, col_4)
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check that all requested workers were:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 20, 'All parallel workers have been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during index
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 3);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 1;
+ SELECT pg_reload_conf();
+});
+
+# Since 2 parallel workers have already been launched and will be released
+# later, we expect that:
+# 1) number of free workers will be '0' after config reload
+# 2) number of free workers will be '1' after releasing workers
+
+# Check statement (1)
+$log_start = $node->wait_for_log(
+ qr/number of free parallel autovacuum workers is set to 0 due to config reload/,
+ $log_start
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+$log_start = $node->wait_for_log(
+ qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ $log_start
+);
+
+# Check statement (2)
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 1, 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if the leader gets an
+# error. First, simulate the situation where the leader exits due to an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$log_start = $node->wait_for_log(
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/,
+ $log_start
+);
+
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: 2 parallel workers has been released/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as the test above, but simulate the leader exiting due to FATAL.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the autovacuum leader has reserved parallel workers, then kill it
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: 2 parallel workers has been released/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..e5646e0def5
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..dd5c839e851
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "postmaster/autovacuum.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ int32 nfree_workers;
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_INT32(nfree_workers);
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
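For readers following Test 3 in 001_parallel_autovacuum.pl, the free-worker accounting it expects can be modeled outside PostgreSQL. The sketch below is a standalone C model, not code from the patch: ParallelWorkerPool, pool_reserve, pool_release and pool_set_max are hypothetical names, and both the reload rule (shift the free count by the change in the limit, clamped at zero) and the cap applied on release are assumptions inferred from the values the test checks.

```c
#include <assert.h>

/* Hypothetical stand-in for the two shared-memory counters in the patch. */
typedef struct
{
    int free_workers;   /* models av_freeParallelWorkers */
    int max_workers;    /* models av_maxParallelWorkers */
} ParallelWorkerPool;

int imin(int a, int b) { return a < b ? a : b; }
int imax(int a, int b) { return a > b ? a : b; }

/* Reserve up to 'requested' workers; returns how many were granted. */
int
pool_reserve(ParallelWorkerPool *pool, int requested)
{
    int granted;

    assert(requested >= 0);
    granted = imin(requested, pool->free_workers);
    pool->free_workers -= granted;
    return granted;
}

/*
 * Give workers back. Assumption: the free count never exceeds the current
 * limit, even if the limit was lowered while the workers were in use.
 */
void
pool_release(ParallelWorkerPool *pool, int nworkers)
{
    assert(nworkers >= 0);
    pool->free_workers = imin(pool->free_workers + nworkers,
                              pool->max_workers);
}

/*
 * Apply a new limit, as on a config reload. Assumption: the free count is
 * shifted by the change in the limit and clamped at zero.
 */
void
pool_set_max(ParallelWorkerPool *pool, int new_max)
{
    int nfree = pool->free_workers + (new_max - pool->max_workers);

    pool->free_workers = imax(nfree, 0);
    pool->max_workers = new_max;
}
```

Starting from 20 free workers with a limit of 20, reserving 2, lowering the limit to 1 and then releasing 2 leaves 0 and then 1 free workers in this model, which are the two values Test 3 checks.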
[text/x-patch] v25-0002-Logging-for-parallel-autovacuum.patch (10.2K, 4-v25-0002-Logging-for-parallel-autovacuum.patch)
From 51f3c2856a2b583dc83c477cb03add67f363c058 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v25 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 54 ++++++++++++++++++++++++++-
src/backend/commands/vacuumparallel.c | 32 +++++++++++++---
src/include/commands/vacuum.h | 39 ++++++++++++++++++-
src/tools/pgindent/typedefs.list | 2 +
4 files changed, 117 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 82c5b28e0ad..28624d3ba25 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -343,6 +343,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +788,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1135,42 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d reserved, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nreserved,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2669,7 +2717,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3103,7 +3152,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 806a7f48326..643849b2fb8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -228,7 +228,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -503,7 +503,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -514,7 +514,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -522,7 +523,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -534,7 +536,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -616,10 +619,13 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the statistics it stores will be updated according
+ * to what happens during function execution.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -656,13 +662,23 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ {
AutoVacuumReserveParallelWorkers(&nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nreserved += nworkers;
+ }
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -729,6 +745,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..1b1fb625cb2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,39 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ /* Number of parallel workers we planned to launch */
+ int nplanned;
+
+ /*
+ * Number of parallel workers we have managed to reserve.
+ *
+ * Note that we collect this statistic only for parallel *autovacuum*,
+ * which must reserve workers in shared state before actually trying to
+ * launch them (in order to respect the
+ * autovacuum_max_parallel_workers limit). Manual VACUUM (PARALLEL), by
+ * contrast, doesn't need to reserve workers.
+ */
+ int nreserved;
+
+ /* Number of launched parallel workers */
+ int nlaunched;
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +427,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3da19d41413..0fb40f3c07f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2071,6 +2071,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
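The counters in this patch are cumulative: each index vacuum or cleanup pass adds to nplanned, nreserved and nlaunched, and the totals are printed once at the end. A standalone C sketch of that accumulation follows. WorkerStats and the worker_stats_* helpers are hypothetical names that only mirror PVWorkersStats; the message text is copied from the autovacuum branch of the log output, and the helpers themselves are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of PVWorkersStats from the patch. */
typedef struct
{
    int nplanned;    /* workers we planned to launch */
    int nreserved;   /* workers reserved in shared state */
    int nlaunched;   /* workers actually launched */
} WorkerStats;

/* Zero all counters, as heap_vacuum_rel does before vacuuming starts. */
void
worker_stats_reset(WorkerStats *ws)
{
    memset(ws, 0, sizeof(*ws));
}

/* Accumulate the numbers of one index pass; a NULL target is ignored. */
void
worker_stats_add(WorkerStats *ws, int planned, int reserved, int launched)
{
    if (ws == NULL)
        return;
    ws->nplanned += planned;
    ws->nreserved += reserved;
    ws->nlaunched += launched;
}

/* Format the autovacuum variant of the index vacuum summary line. */
int
worker_stats_format(const WorkerStats *ws, char *buf, size_t buflen)
{
    assert(ws != NULL && buf != NULL);
    return snprintf(buf, buflen,
                    "parallel workers: index vacuum: %d planned, %d reserved, %d launched in total",
                    ws->nplanned, ws->nreserved, ws->nlaunched);
}
```

In this model, two passes that each plan 2 workers, where the second pass gets only 1 reserved and launched, produce the totals 4 planned, 3 reserved and 3 launched.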
[text/x-patch] v25-0001-Parallel-autovacuum.patch (19.4K, 5-v25-0001-Parallel-autovacuum.patch)
From 28f57af798fb8e218f2e02889b8c49c3a2baf0ec Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v25 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 8 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..806a7f48326 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
@@ -374,8 +377,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -554,12 +558,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -598,8 +607,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -647,6 +656,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -691,6 +707,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -739,6 +765,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 695e187ba11..e1c995dd2ea 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -152,6 +152,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -286,6 +293,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -300,6 +309,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ int32 av_freeParallelWorkers;
+ int32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -362,6 +373,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -760,6 +772,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -776,6 +790,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1380,6 +1403,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2276,6 +2309,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2457,6 +2496,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2857,8 +2902,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3335,6 +3384,88 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter; the requested number of parallel workers
+ * to reserve by the caller, and set to the actual number of reserved workers.
+ *
+ * The caller must call AutoVacuumRelease[All]ParallelWorkers() to release the
+ * reserved workers.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+}
+
+/*
+ * Releases the reserved parallel workers for autovacuum.
+ *
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Cannot release more workers than reserved */
+ nworkers = Min(nworkers, av_nworkers_reserved);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+}
+
+/*
+ * Same as above, but this function releases all the parallel workers that
+ * this autovacuum worker reserved.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3395,6 +3526,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_parallel_workers);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3476,3 +3611,28 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjusts the number of free parallel workers to correspond to the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ int nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..4a5c73a9e33 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only
+ * cost-based delay settings need to be propagated to parallel vacuum workers,
+ * and we will handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a5a0edf2534..c2395cf6638 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e686d88afc4..5e1c62d616c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 199fc64ddf5..84f1040aed2 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..11dd3aebc6c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,14 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Target number of parallel autovacuum workers. -1 by default disables
+ * parallel vacuum during autovacuum. 0 means choose the parallel degree
+ * based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
[text/x-patch] v25-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 6-v25-0005-Documentation-for-parallel-autovacuum.patch)
From 930971e2a20455541cb8cdf5005a5b137d5f1fe4 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v25 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8cdd826fbd3..73f839b6a8d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9395,6 +9396,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-parallel-workers"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers in order to vacuum indexes of this table
+ in parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..4894de021cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v24--v25-diff-for-0004.patch (2.9K, 7-v24--v25-diff-for-0004.patch)
From deaf24759d8399f56d7e155758df7771f6ee628f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 11 Mar 2026 17:53:09 +0700
Subject: [PATCH] fixes for 0004
---
src/backend/postmaster/autovacuum.c | 25 +++++--------------
.../t/001_parallel_autovacuum.pl | 4 +--
2 files changed, 8 insertions(+), 21 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 869ca8f759b..7e994e88853 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1410,18 +1410,7 @@ avl_sigusr2_handler(SIGNAL_ARGS)
static void
autovacuum_worker_before_shmem_exit(int code, Datum arg)
{
- int nreserved_old = av_nworkers_reserved;
-
AutoVacuumReleaseAllParallelWorkers();
-
- if (nreserved_old > 0)
- {
- elog(DEBUG2,
- ngettext("autovacuum worker before_shmem_exit: %d parallel worker has been released",
- "autovacuum worker before_shmem_exit: %d parallel workers has been released",
- nreserved_old - av_nworkers_reserved),
- nreserved_old - av_nworkers_reserved);
- }
}
/*
@@ -2507,20 +2496,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
- int nreserved_workers = av_nworkers_reserved;
-
/*
* Parallel autovacuum can reserve parallel workers. Make sure
* that all reserved workers are released.
*/
AutoVacuumReleaseAllParallelWorkers();
- if (nreserved_workers > 0)
- ereport(DEBUG2,
- (errmsg("%d parallel autovacuum workers has been released after occured error",
- nreserved_workers),
- errhidecontext(true)));
-
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -3467,6 +3448,12 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+
+ elog(DEBUG2,
+ ngettext("autovacuum worker: %d parallel worker has been released",
+ "autovacuum worker: %d parallel workers has been released",
+ nworkers),
+ nworkers);
}
/*
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 7f8b5a7b4d3..e5dacf59980 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -249,7 +249,7 @@ $log_start = $node->wait_for_log(
);
$log_start = $node->wait_for_log(
- qr/2 parallel autovacuum workers has been released after occured error/,
+ qr/autovacuum worker: 2 parallel workers has been released/,
$log_start
);
@@ -286,7 +286,7 @@ $node->safe_psql('postgres', qq{
});
$log_start = $node->wait_for_log(
- qr/autovacuum worker before_shmem_exit: 2 parallel workers has been released/,
+ qr/autovacuum worker: 2 parallel workers has been released/,
$log_start
);
--
2.43.0
[text/x-patch] v24--v25-diff-for-0001.patch (812B, 8-v24--v25-diff-for-0001.patch)
From 3723f43253bbf20bc0ac313d4f2d66065be629f0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 11 Mar 2026 16:18:15 +0700
Subject: [PATCH] fixes for 0001
---
src/backend/postmaster/autovacuum.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d403450d508..e1c995dd2ea 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3432,7 +3432,7 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
Assert(AmAutoVacuumWorkerProcess());
/* Cannot release more workers than reserved */
- Assert(nworkers <= av_nworkers_reserved);
+ nworkers = Min(nworkers, av_nworkers_reserved);
LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-11 19:05 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-11 19:05 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Mar 11, 2026 at 4:28 AM Daniil Davydov <[email protected]> wrote:
>
>
> > While I agree that showing only two numbers might lack some
> > information for users, I guess the same is true for
> > max_parallel_maintenance_workers or other parallel queries related to
> > GUC parameters. For instance, suppose we set
> > max_parallel_maintenance_workers to 2, if the table has (large enough)
> > 4 indexes, we would plan to execute a parallel vacuum with 2 workers
> > instead of 4 due to max_parallel_maintenance_worker shortage and it's
> > even possible that only 1 worker can launch due to
> > max_worker_processes shortage. In this case, we currently consider
> > that 2 workers are planned. Isn't it the same situation as the case
> > where we reserved 2 parallel vacuum workers for autovacuum for the
> > table with 4 indexes?
>
> I don't think that examples with other "max_parallel_" parameters will be
> appropriate, because these parameters are limiting the number of parallel
> workers for *single* operation/executor node/... . At the same time,
> av_max_parallel_workers limits the total number of parallel workers across
> all a/v leaders.
>
> Regarding the situation that you provided :
> The number of planned workers is reduced inside the
> parallel_vacuum_compute_workers due to the max_parallel_maintenance_workers
> limit. I.e. we cannot plan more workers than required by the config, and
> it's completely OK. No one expects the number of "planned workers" to be more
> than max_parallel_maintenance_workers.
>
> IMO there is no need to make efforts to track the shortage of
> max_parallel_maintenance_workers for the VACUUM (PARALLEL), because this
> parameter just plays the role of a limiter. We will consider only the
> shortage of max_parallel_workers, that can be determined by looking at
> "planned vs. launched".
>
> And here is a difference with a parallel autovacuum :
> av_max_parallel_workers is considered twice : in the
> "parallel_vacuum_compute_workers" and "ReserveWorkers" functions.
> So the low number of launched workers can be explained by the shortage of
> both av_max_parallel_workers and max_parallel_workers. Since we want to
> distinguish between these cases, we have added the "nreserved" concept.
>
> I see that few modules can report something like "out of background worker
> slots" when they cannot launch more workers due to max_parallel_workers
> shortage (but modules depending on the "parallel.c" logic don't do so).
> This fact gave me another idea :
> If we don't want to log "nreserved" or some other similar value, maybe
> we should add logging after the "ReserveWorkers" function? I.e. if some
> workers cannot be reserved, we can emit a log like "out of parallel
> autovacuum workers. you should increase the av_max_parallel_workers
> parameter". Having this log can help the user distinguish between
> max_parallel_workers/av_max_parallel_workers shortage situations.
> What do you think?
My point is that the process of determining the number of workers
planned to launch is somewhat unclear to users in both cases. We
consider not only GUCs such as max_parallel_maintenance_workers but
also index AM definitions (i.e., amparallelvacuumoption) and index
sizes etc. But I agree that providing more detailed logs might help
users understand and notice the av_max_parallel_workers shortage.
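To make the planned / reserved / launched distinction concrete, here is a
minimal standalone sketch of the arithmetic. It is not PostgreSQL code: the
function names are hypothetical, nindexes_parallel is assumed to be the number
of indexes already found eligible for a parallel worker, max_workers stands for
whichever GUC limit applies, and free_workers stands in for
av_freeParallelWorkers in shared memory.

```c
#include <assert.h>

/* Hypothetical helper; PostgreSQL itself uses the Min() macro. */
static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * Planned workers: the request (or, if there is none, the number of
 * eligible indexes), capped by the applicable GUC limit. This follows the
 * shape of parallel_vacuum_compute_workers() as shown in the patch.
 */
static int
plan_workers(int nrequested, int nindexes_parallel, int max_workers)
{
    int         parallel_workers;

    if (max_workers == 0)
        return 0;

    parallel_workers = (nrequested > 0) ?
        min_int(nrequested, nindexes_parallel) : nindexes_parallel;

    return min_int(parallel_workers, max_workers);
}

/*
 * Reserved workers: whatever is still free in the shared pool, at most the
 * planned number. The pool counter is decremented by the amount granted.
 */
static int
reserve_workers(int *free_workers, int nplanned)
{
    int         nreserved = min_int(*free_workers, nplanned);

    assert(nreserved >= 0);
    *free_workers -= nreserved;
    return nreserved;
}

/*
 * Workers that were reserved but could not be launched (for example because
 * no background worker slot was available) go back to the pool.
 */
static int
workers_to_give_back(int nreserved, int nlaunched)
{
    assert(nlaunched <= nreserved);
    return nreserved - nlaunched;
}
```

With 4 eligible indexes and a limit of 2, plan_workers(0, 4, 2) gives 2
planned workers. If only 1 of them launches, workers_to_give_back(2, 1)
returns 1 to the pool. A gap between planned and reserved points at the
autovacuum pool limit, while a gap between reserved and launched points at
max_parallel_workers or max_worker_processes.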
BTW, this discussion made me think about changing av_max_parallel_workers to
control the number of workers per autovacuum worker instead (renaming
it to, say, max_parallel_workers_per_autovacuum_worker). Users
can compute the maximum number of parallel workers the system requires
as (autovacuum_worker_slots *
max_parallel_workers_per_autovacuum_worker). We would no longer need
the reservation and release logic. I'd like to hear your opinion.
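As a small worked example of that sizing rule, the following sketch computes
the worst-case demand. The function name is hypothetical, and
max_parallel_workers_per_autovacuum_worker is only the GUC name suggested
above, not an existing parameter.

```c
#include <assert.h>

/*
 * Worst-case number of parallel workers autovacuum could ask for when every
 * autovacuum worker slot is busy and each leader uses its full per-worker
 * allowance. Both arguments are plain integers in this sketch.
 */
static int
worst_case_parallel_workers(int autovacuum_worker_slots,
                            int max_parallel_workers_per_autovacuum_worker)
{
    assert(autovacuum_worker_slots >= 0);
    assert(max_parallel_workers_per_autovacuum_worker >= 0);

    return autovacuum_worker_slots *
        max_parallel_workers_per_autovacuum_worker;
}
```

For instance, with autovacuum_worker_slots = 16 (the value shown in
postgresql.conf.sample) and a per-worker cap of 2, the result is 32, which
is the number max_parallel_workers would have to accommodate in the worst case.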
>
> Summary :
> 1)
> I think that we should not look at maintenance vacuum while
> considering how to inform the user about parameters shortage for autovacuum,
> because we have a more complicated situation in case of autovacuum.
> 2)
> I suggest adding a separate log that will be emitted every time we are
> unable to start workers due to a shortage of av_max_parallel_workers.
For (2), do you mean that the worker writes these logs regardless of
log_autovacuum_min_duration setting? I'm concerned that the server
logs would be flooded with these logs especially when multiple
autovacuum workers are working very actively and the system is facing
a shortage of av_max_parallel_workers.
>
> > * 0004 patch:
> >
> > Can we write the same test cases while not relying on the 0002 patch
> > (i.e., worker usage logging)? We check the worker usage log at two
> > places in the regression tests. The idea is that we write the number
> > of workers planned, reserved, and launched in DEBUG log level and
> > check these logs in the regression tests. The patch 0001, 0003, and
> > 0004 can be merged before push while we might want more discussion on
> > the 0002 patch.
>
> Possibly we can introduce a new injection point, or a new log for it.
> But I assume that the subject of discussion in patch 0002 is the
> "nreserved" logic, and "nlaunched/nplanned" logic does not raise any
> questions.
>
> I suggest splitting the 0002 patch into two parts : 1) basic logic and
> 2) additional logic with nreserved or something else. The second part can be
> discussed in isolation from the patch set. If we do this, we may not have to
> change the tests. What do you think?
Assuming the basic logic means nlaunched/nplanned logic, yes, it would
be a nice idea. I think user-facing logging stuff can be developed as
an improvement independent from the main parallel autovacuum patch.
It's ideal if we can implement the main patch (with tests) without
relying on the user-facing logging.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-16 12:33 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-16 12:33 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Thu, Mar 12, 2026 at 2:05 AM Masahiko Sawada <[email protected]> wrote:
>
> BTW, this discussion made me think about changing av_max_parallel_workers to
> control the number of workers per autovacuum worker instead (renaming
> it to, say, max_parallel_workers_per_autovacuum_worker). Users
> can compute the maximum number of parallel workers the system requires
> as (autovacuum_worker_slots *
> max_parallel_workers_per_autovacuum_worker). We would no longer need
> the reservation and release logic. I'd like to hear your opinion.
>
IIUC, one of autovacuum's main goals is to be "inconspicuous" to the
rest of the system. I mean that it should not try to vacuum all the tables
as fast as possible. Instead, it should interfere with other backends
as little as possible and avoid high resource consumption (assuming
there is no wraparound hazard).
I propose to reason from the case in which parallel a/v will
actually be used:
We have 3 tables, each of which has 80+ indexes and requires
parallel a/v. Ideally, each of these tables should be processed with 20
parallel workers. This is a real example that can be encountered in
various production systems, where such tables take up about half of all the
data in the database.
How will parallel a/v handle such a situation?
1. Our current implementation
We can set av_max_parallel_workers to 60 and autovacuum_parallel_workers
reloption to 20 for each table.
2. Proposed idea
We can set max_parallel_workers_per_av_worker to 20 and
autovacuum_parallel_workers reloption to 20 for each table.
In both cases we have a guarantee that all tables will be processed with the
desired number of parallel workers. And both cases allow us to limit
CPU consumption, either by reducing the "av_max_parallel_workers" parameter
(for the current implementation) or by reducing the
"autovacuum_parallel_workers" reloption for each table (for the proposed
idea). So basically I don't see that the current approach has any big
advantage over the idea you proposed.
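The two configurations can be compared with a small standalone model. This is
only a sketch under simplifying assumptions: all leaders are assumed to hold
their workers at the same time, leaders are served in array order, and
max_parallel_workers / max_worker_processes are ignored. The function names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper; PostgreSQL itself uses the Min() macro. */
static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * Scheme 1 (current patch): one shared pool. Each leader gets what is still
 * free, up to its request. Returns the total number of workers granted.
 */
static int
grant_shared_pool(int pool_size, const int *requested, int *granted, size_t n)
{
    int         free_workers = pool_size;
    int         total = 0;

    for (size_t i = 0; i < n; i++)
    {
        granted[i] = min_int(free_workers, requested[i]);
        free_workers -= granted[i];
        total += granted[i];
    }
    return total;
}

/*
 * Scheme 2 (proposed): an independent cap per leader, no shared state.
 * Returns the total number of workers granted.
 */
static int
grant_per_leader(int per_leader_cap, const int *requested, int *granted,
                 size_t n)
{
    int         total = 0;

    for (size_t i = 0; i < n; i++)
    {
        granted[i] = min_int(per_leader_cap, requested[i]);
        total += granted[i];
    }
    return total;
}
```

For three tables requesting 20 workers each, a pool of 60 and a per-leader cap
of 20 both grant 20/20/20. The schemes differ once the limit is lowered: in
this model a pool of 30 grants 20/10/0 depending on arrival order, while a
per-leader cap of 10 grants 10/10/10.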
I also asked a friend who has worked for many years with clients running
large production systems. He said that it is very important to process such
huge tables with maximum "intensity", i.e. each a/v worker should be able to
launch as many parallel workers as required. I guess that this is an argument
in favor of your idea.
The only argument against this idea that I could come up with is that some
users may abuse the parallel a/v feature. For instance, the user can set the
"autovacuum_parallel_workers" reloption not only for large tables, but also
for many smaller ones. In this case max_parallel_workers_per_av_worker
must be pretty large (in order to process the huge table). Thus, the user
can face a situation where all a/v workers launch additional parallel
workers, which leads to high CPU consumption and possibly a
max_parallel_workers shortage. The only way to deal with it is to go through
a large number of smaller tables and reduce the "autovacuum_parallel_workers"
reloption for each of them. IMHO, this is a pretty unpleasant experience for
the user. On the other hand, the user is the one to blame for such a
situation.
Let's summarize.
The proposed idea has several strong advantages over the current
implementation. The only disadvantage I came up with can be avoided by
documenting recommendations on how to use this feature. So, if I haven't
messed anything up and you don't have any doubts, I would rather implement
the proposed idea.
> > 2)
> > I suggest adding a separate log that will be emitted every time we are
> > unable to start workers due to a shortage of av_max_parallel_workers.
>
> For (2), do you mean that the worker writes these logs regardless of
> log_autovacuum_min_duration setting? I'm concerned that the server
> logs would be flooded with these logs especially when multiple
> autovacuum workers are working very actively and the system is facing
> a shortage of av_max_parallel_workers.
Oh, I didn't take that into account. But this is not a problem - we can
accumulate such statistics just as we do now for the "nreserved" ones. And
then we will log this value with all other stats.
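The accumulate-then-log approach can be sketched roughly as follows. This is an
illustration under assumptions, not the patch itself: ToyWorkersStats and the
toy_stats_* helpers are hypothetical names, and only the message text is
borrowed from the attached logging patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counters, modelled on nplanned/nreserved/nlaunched. */
typedef struct ToyWorkersStats
{
	int			nplanned;
	int			nreserved;
	int			nlaunched;
} ToyWorkersStats;

/* Add the numbers from one index-processing round to the running totals. */
static void
toy_stats_add(ToyWorkersStats *stats, int planned, int reserved, int launched)
{
	assert(stats != NULL);
	assert(launched <= reserved && reserved <= planned);

	stats->nplanned += planned;
	stats->nreserved += reserved;
	stats->nlaunched += launched;
}

/* Workers we wanted but could not reserve, summed over all rounds. */
static int
toy_stats_shortage(const ToyWorkersStats *stats)
{
	return stats->nplanned - stats->nreserved;
}

/* Write one summary line into buf; returns whatever snprintf returns. */
static int
toy_stats_format(const ToyWorkersStats *stats, char *buf, size_t buflen)
{
	return snprintf(buf, buflen,
					"parallel workers: index vacuum: %d planned, %d reserved, %d launched in total",
					stats->nplanned, stats->nreserved, stats->nlaunched);
}
```

Because the shortage is folded into the totals, a busy system produces one
summary line per vacuumed table instead of one message per failed reservation.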
> > Possibly we can introduce a new injection point, or a new log for it.
> > But I assume that the subject of discussion in patch 0002 is the
> > "nreserved" logic, and "nlaunched/nplanned" logic does not raise any
> > questions.
> >
> > I suggest splitting the 0002 patch into two parts : 1) basic logic and
> > 2) additional logic with nreserved or something else. The second part can be
> > discussed in isolation from the patch set. If we do this, we may not have to
> > change the tests. What do you think?
>
> Assuming the basic logic means nlaunched/nplanned logic, yes, it would
> be a nice idea. I think user-facing logging stuff can be developed as
> an improvement independent from the main parallel autovacuum patch.
> It's ideal if we can implement the main patch (with tests) without
> relying on the user-facing logging.
OK, actually we can do it.
Thank you very much for the review!
Please see the attached patches. The changes are:
1) Fixed a segfault caused by accessing an outdated pv_shared_cost_params
pointer.
2) "Logging for autovacuum" is split into two patches: basic logging
(nplanned/nlaunched) and advanced logging (nreserved).
3) Tests are now independent of logging.
So far I haven't tried to change the core logic. I think that first we need to
agree on the use of the new GUC parameter.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v26-0006-Advanced-logging-for-parallel-autovacuum.patch (4.7K, 2-v26-0006-Advanced-logging-for-parallel-autovacuum.patch)
From 8b6fd4254165d8551a15f84b0e2973e134ce3938 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 19:09:01 +0700
Subject: [PATCH v26 6/6] Advanced logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 40 +++++++++++++++++++++------
src/backend/commands/vacuumparallel.c | 6 ++++
src/include/commands/vacuum.h | 15 ++++++++--
3 files changed, 51 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 4f97baced2b..df709afe622 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -791,8 +791,10 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->workers_usage.vacuum.nlaunched = 0;
vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.vacuum.nreserved = 0;
vacrel->workers_usage.cleanup.nlaunched = 0;
vacrel->workers_usage.cleanup.nplanned = 0;
+ vacrel->workers_usage.cleanup.nreserved = 0;
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
@@ -1146,17 +1148,39 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->lpdead_items);
if (vacrel->workers_usage.vacuum.nplanned > 0)
{
- appendStringInfo(&buf,
- _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
- vacrel->workers_usage.vacuum.nplanned,
- vacrel->workers_usage.vacuum.nlaunched);
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nreserved,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
}
if (vacrel->workers_usage.cleanup.nplanned > 0)
{
- appendStringInfo(&buf,
- _("parallel workers: index cleanup: %d planned, %d launched\n"),
- vacrel->workers_usage.cleanup.nplanned,
- vacrel->workers_usage.cleanup.nlaunched);
+ if (AmAutoVacuumWorkerProcess())
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d reserved, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nreserved,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
+ else
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
}
for (int i = 0; i < vacrel->nindexes; i++)
{
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 650060871d3..5105137ce3b 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -837,8 +837,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* fewer workers than we requested.
*/
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ {
AutoVacuumReserveParallelWorkers(&nworkers);
+ /* Remember this value, if we asked to */
+ if (wstats != NULL)
+ wstats->nreserved += nworkers;
+ }
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index cf0c3c9dbf7..4bfeba8264d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -308,13 +308,24 @@ typedef struct PVWorkersStats
/* Number of parallel workers we are planned to launch */
int nplanned;
+ /*
+ * Number of parallel workers we have managed to reserve.
+ *
+ * Note that we collect this statistic only for parallel *autovacuum*,
+ * since it must reserve workers in shared state before actually trying
+ * to launch them (in order to respect the
+ * autovacuum_max_parallel_workers limit). Manual VACUUM (PARALLEL), by
+ * contrast, doesn't need to reserve workers.
+ */
+ int nreserved;
+
/* Number of launched parallel workers */
int nlaunched;
} PVWorkersStats;
/*
- * PVWorkersUsage stores information about total number of launched and
- * planned workers during parallel vacuum (both for vacuum and cleanup).
+ * PVWorkersUsage stores information about total number of launched, reserved
+ * and planned workers during parallel vacuum (both for vacuum and cleanup).
*/
typedef struct PVWorkersUsage
{
--
2.43.0
[text/x-patch] v26-0004-Tests-for-parallel-autovacuum.patch (19.9K, 3-v26-0004-Tests-for-parallel-autovacuum.patch)
From d6a096cfaddf01ef75786e35cabf1a009d652c69 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v26 4/6] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 22 ++
src/backend/postmaster/autovacuum.c | 25 ++
src/include/postmaster/autovacuum.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 28 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../t/001_parallel_autovacuum.pl | 299 ++++++++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 +
.../modules/test_autovacuum/test_autovacuum.c | 31 ++
.../test_autovacuum/test_autovacuum.control | 3 +
13 files changed, 470 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index cccaee5b620..4f97baced2b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -873,6 +874,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2cad6b15517..650060871d3 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -47,6 +47,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -656,6 +657,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -916,6 +925,19 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * To be able to exercise whether all reserved parallel workers are being
+ * released anyway, allow injection points to trigger a failure at this
+ * point.
+ *
+ * This injection point is also used to wait until parallel workers
+ * finishes their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index a0c020fa1a7..7e994e88853 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3448,6 +3448,12 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+
+ elog(DEBUG2,
+ ngettext("autovacuum worker: %d parallel worker has been released",
+ "autovacuum worker: %d parallel workers has been released",
+ nworkers),
+ nworkers);
}
/*
@@ -3466,6 +3472,21 @@ AutoVacuumReleaseAllParallelWorkers(void)
Assert(av_nworkers_reserved == 0);
}
+/*
+ * Get number of free autovacuum parallel workers.
+ */
+int32
+AutoVacuumGetFreeParallelWorkers(void)
+{
+ int32 nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_SHARED);
+ nfree_workers = AutoVacuumShmem->av_freeParallelWorkers;
+ LWLockRelease(AutovacuumLock);
+
+ return nfree_workers;
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3634,5 +3655,9 @@ adjust_free_parallel_workers(int prev_max_parallel_workers)
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+ elog(DEBUG2,
+ "number of free parallel autovacuum workers is set to %u due to config reload",
+ AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index f3783afb51b..d60010a43b4 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,7 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern void AutoVacuumReserveParallelWorkers(int *nworkers);
extern void AutoVacuumReleaseParallelWorkers(int nworkers);
extern void AutoVacuumReleaseAllParallelWorkers(void);
+extern int32 AutoVacuumGetFreeParallelWorkers(void);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4ac5c84db43..01fe0041c97 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e2b3eef4136..9dcdc68bc87 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..32254c53a5d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..969af8bd52a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..34f11193dfd
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,299 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ });
+
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres, so it can launch parallel autovacuum workers, log all
+# information we are interested in and autovacuum works frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 1) launched
+# 2) correctly released
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: 2 parallel workers has been released/,
+ $log_start
+);
+
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 20, 'All parallel workers has been released by the leader');
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is inited
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during indexes
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# Test 3:
+# Test adjustment of free parallel workers number when changing
+# autovacuum_max_parallel_workers parameter
+
+prepare_for_next_test($node, 3);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 1;
+ SELECT pg_reload_conf();
+});
+
+# Since 2 parallel workers already launched and will be released in the future,
+# we are expecting that :
+# 1) number of free workers will be '0' after config reload
+# 2) number of free workers will be '1' after releasing workers
+
+# Check statement (1)
+$log_start = $node->wait_for_log(
+ qr/number of free parallel autovacuum workers is set to 0 due to config reload/,
+ $log_start
+);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+# Wait until the end of parallel processing
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: 2 parallel workers has been released/,
+ $log_start
+);
+
+# Check statement (2)
+$psql_out = $node->safe_psql('postgres', qq{
+ SELECT get_parallel_autovacuum_free_workers();
+});
+is($psql_out, 1, 'Number of free parallel workers is consistent');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+ ALTER SYSTEM SET autovacuum_max_parallel_workers = 10;
+ SELECT pg_reload_conf();
+});
+
+# Test 4:
+# We want parallel autovacuum workers to be released even if leader gets an
+# error. At first, simulate situation, when leader exits due to an ERROR.
+
+prepare_for_next_test($node, 4);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'error');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$log_start = $node->wait_for_log(
+ qr/error triggered for injection point / .
+ qr/autovacuum-leader-before-indexes-processing/,
+ $log_start
+);
+
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: 2 parallel workers has been released/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+# Test 5:
+# Same as above test, but simulate situation, when leader exits due to FATAL.
+
+prepare_for_next_test($node, 5);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the autovacuum leader has reserved parallel workers, then kill it
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+my $av_pid = $node->safe_psql('postgres', qq{
+ SELECT pid FROM pg_stat_activity
+ WHERE backend_type = 'autovacuum worker'
+ AND wait_event = 'autovacuum-leader-before-indexes-processing'
+ LIMIT 1;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT pg_terminate_backend('$av_pid');
+});
+
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: 2 parallel workers has been released/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..e5646e0def5
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting shared autovacuum state
+ */
+
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..dd5c839e851
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "postmaster/autovacuum.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ int32 nfree_workers;
+
+ nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+ PG_RETURN_INT32(nfree_workers);
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
[text/x-patch] v26-0002-Logging-for-parallel-autovacuum.patch (8.8K, 4-v26-0002-Logging-for-parallel-autovacuum.patch)
From 51590d883e666192c650a5a4b88507ee4c54b76f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 19:01:05 +0700
Subject: [PATCH v26 2/6] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 32 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 26 +++++++++++++++++-----
src/include/commands/vacuum.h | 28 +++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 2 ++
4 files changed, 78 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 82c5b28e0ad..cccaee5b620 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -343,6 +343,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +788,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1135,20 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2669,7 +2695,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3103,7 +3130,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 806a7f48326..2d583435696 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -228,7 +228,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -503,7 +503,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -514,7 +514,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -522,7 +523,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -534,7 +536,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -616,10 +619,13 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the statistics it stores will be updated according
+ * to what happens during function execution.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -656,6 +662,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Reserve workers in autovacuum global state. Note that we may be given
* fewer workers than we requested.
@@ -729,6 +739,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..1d820915d71 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ /* Number of parallel workers we planned to launch */
+ int nplanned;
+
+ /* Number of launched parallel workers */
+ int nlaunched;
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores the total numbers of planned and launched workers
+ * during parallel vacuum (for both index vacuum and index cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 52f8603a7be..a67d54e1819 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
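For readers following along, the bookkeeping above can be tried out in isolation. This is only a minimal sketch, not code from the patch: the two structs mirror the ones added to vacuum.h, while record_pass() is a hypothetical helper standing in for the two "Remember this value" hunks in parallel_vacuum_process_all_indexes().

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the patch's PVWorkersStats: counters for one phase. */
typedef struct PVWorkersStats
{
	int			nplanned;	/* workers we planned to launch */
	int			nlaunched;	/* workers actually launched */
} PVWorkersStats;

/* Mirrors the patch's PVWorkersUsage: one counter pair per phase. */
typedef struct PVWorkersUsage
{
	PVWorkersStats vacuum;
	PVWorkersStats cleanup;
} PVWorkersUsage;

/*
 * Hypothetical helper (not in the patch): accumulate the result of one
 * index pass.  A NULL wstats means the caller does not want statistics.
 */
void
record_pass(PVWorkersStats *wstats, int nplanned, int nlaunched)
{
	assert(nlaunched >= 0);

	if (wstats == NULL)
		return;

	/* Counters accumulate across passes, as in the patch. */
	if (nplanned > 0)
		wstats->nplanned += nplanned;
	wstats->nlaunched += nlaunched;
}
```

A caller would zero-initialize a PVWorkersUsage once per table and pass &usage.vacuum or &usage.cleanup on each pass, so a gap between nplanned and nlaunched shows how often fewer workers than planned were available.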
[text/x-patch] v26-0003-Cost-based-parameters-propagation-for-parallel-a.patch (11.0K, 5-v26-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From eb1449d0d4570059f1e271ecd81dac8c26a62db0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v26 3/6] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 21 +++-
src/backend/commands/vacuumparallel.c | 163 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 186 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2d583435696..2cad6b15517 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -54,6 +61,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -123,6 +155,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * leader autovacuum worker and its parallel vacuum workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -225,6 +269,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -236,6 +285,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -396,6 +446,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -461,6 +526,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -540,6 +608,95 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
&wusage->cleanup);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For a non-autovacuum parallel worker, this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1103,6 +1260,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1152,6 +1312,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e1c995dd2ea..a0c020fa1a7 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1691,7 +1691,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE, or a parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1d820915d71..cf0c3c9dbf7 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -423,6 +423,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a67d54e1819..15b8c966bf8 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
--
2.43.0
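The generation-counter handshake in this patch can be sketched outside the backend. The following is a minimal illustration under stated assumptions, not the patch's code: SharedCostParams and WorkerLocal are hypothetical stand-ins (reduced to two parameters), C11 atomics replace pg_atomic_uint32, and a simple atomic_flag spin lock replaces slock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for PVSharedCostParams, reduced to two fields. */
typedef struct SharedCostParams
{
	atomic_uint generation;	/* bumped by the leader after each update */
	atomic_flag lock;		/* stand-in for the slock_t mutex */
	double		cost_delay;
	int			cost_limit;
} SharedCostParams;

/* Hypothetical per-worker copy of the parameters. */
typedef struct WorkerLocal
{
	unsigned	generation_seen;	/* last generation this worker applied */
	double		cost_delay;
	int			cost_limit;
} WorkerLocal;

static void lock_acquire(atomic_flag *l) { while (atomic_flag_test_and_set(l)) ; }
static void lock_release(atomic_flag *l) { atomic_flag_clear(l); }

void
shared_init(SharedCostParams *sp, double delay, int limit)
{
	atomic_init(&sp->generation, 0);
	atomic_flag_clear(&sp->lock);
	sp->cost_delay = delay;
	sp->cost_limit = limit;
}

/* Leader side: publish only if something changed, then bump the generation. */
bool
leader_publish(SharedCostParams *sp, double delay, int limit)
{
	/* Only the leader writes these fields, so it may compare without the lock. */
	if (sp->cost_delay == delay && sp->cost_limit == limit)
		return false;

	lock_acquire(&sp->lock);
	sp->cost_delay = delay;
	sp->cost_limit = limit;
	lock_release(&sp->lock);

	atomic_fetch_add(&sp->generation, 1);
	return true;
}

/* Worker side: a cheap generation check, and a locked copy only on change. */
bool
worker_refresh(SharedCostParams *sp, WorkerLocal *w)
{
	unsigned	gen = atomic_load(&sp->generation);

	assert(w->generation_seen <= gen);
	if (gen == w->generation_seen)
		return false;

	lock_acquire(&sp->lock);
	w->cost_delay = sp->cost_delay;
	w->cost_limit = sp->cost_limit;
	lock_release(&sp->lock);

	w->generation_seen = gen;
	return true;
}
```

The point of the counter is that the common case (nothing changed) costs one atomic read per vacuum_delay_point() call and never touches the lock. If the leader publishes again between the worker's generation read and its copy, the worker picks up the newer values but records the older generation, so it simply refreshes once more on its next check.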
[text/x-patch] v26-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 6-v26-0005-Documentation-for-parallel-autovacuum.patch)
From 89ee0b3f80678164c6bd4b9117c126fc15bd1328 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v26 5/6] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8cdd826fbd3..73f839b6a8d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9395,6 +9396,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is capped by
+ <xref linkend="guc-max-parallel-workers"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..4894de021cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based
+ on the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
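The documentation above says that the computed number of workers may not be available at run time. A minimal sketch of that kind of pool accounting follows, loosely modeled on the reserve/release functions in patch 1/6 of this series. WorkerPool and the pool_* functions are hypothetical names used only for illustration, and the locking the real code needs around the shared counters is omitted.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared counters; real code would hold a lock. */
typedef struct WorkerPool
{
	int			max_workers;	/* current limit */
	int			free_workers;	/* workers not reserved by anyone */
} WorkerPool;

static int min_int(int a, int b) { return a < b ? a : b; }

void
pool_init(WorkerPool *pool, int max_workers)
{
	pool->max_workers = max_workers;
	pool->free_workers = max_workers;
}

/* Reserve up to 'requested' workers; returns how many were granted. */
int
pool_reserve(WorkerPool *pool, int requested)
{
	int			granted;

	assert(requested >= 0);
	granted = min_int(requested, pool->free_workers);
	pool->free_workers -= granted;
	return granted;
}

/* Give workers back, never exceeding a limit that may have been lowered. */
void
pool_release(WorkerPool *pool, int nworkers)
{
	pool->free_workers = min_int(pool->free_workers + nworkers,
								 pool->max_workers);
}

/* Apply a new limit: shift the free count by the delta, clamped at zero. */
void
pool_set_max(WorkerPool *pool, int new_max)
{
	int			nfree = pool->free_workers + (new_max - pool->max_workers);

	pool->free_workers = nfree > 0 ? nfree : 0;
	pool->max_workers = new_max;
}
```

With a limit of 2, a request for 3 workers is granted 2 and a concurrent request is granted 0. Lowering the limit while workers are reserved is handled by the cap in pool_release() rather than by taking reservations away.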
[text/x-patch] v26-0001-Parallel-autovacuum.patch (19.4K, 7-v26-0001-Parallel-autovacuum.patch)
From db49a7776c6faba67a7c5f340f91da5c0e264e83 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v26 1/6] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 8 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..806a7f48326 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
@@ -374,8 +377,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -554,12 +558,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -598,8 +607,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -647,6 +656,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -691,6 +707,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -739,6 +765,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Release all the reserved parallel workers for autovacuum */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseAllParallelWorkers();
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 695e187ba11..e1c995dd2ea 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -152,6 +152,13 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Tracks the number of parallel workers currently reserved by the
+ * autovacuum worker. This is non-zero only for the parallel autovacuum
+ * leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -286,6 +293,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -300,6 +309,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ int32 av_freeParallelWorkers;
+ int32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -362,6 +373,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -760,6 +772,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -776,6 +790,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must adjust the
+ * number of available parallel autovacuum workers in shmem
+ * accordingly.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1380,6 +1403,16 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -2276,6 +2309,12 @@ do_autovacuum(void)
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that all
+ * reserved workers are released even after FATAL error.
+ */
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Perform operations on collected tables.
*/
@@ -2457,6 +2496,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2857,8 +2902,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3335,6 +3384,88 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Reserves parallel workers for autovacuum.
+ *
+ * nworkers is an in/out parameter: on input, the number of parallel workers
+ * the caller requests; on output, the number of workers actually reserved.
+ *
+ * The caller must call AutoVacuumRelease[All]ParallelWorkers() to release the
+ * reserved workers.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller ends up occupying all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader autovacuum worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* The worker must not have any reserved workers yet */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+}
+
+/*
+ * Releases the reserved parallel workers for autovacuum.
+ *
+ * This function should be used to release the parallel workers that an
+ * autovacuum worker reserved by AutoVacuumReserveParallelWorkers(). nworkers
+ * is the number of workers to release, which must not be greater than the
+ * number of workers currently reserved, av_nworkers_reserved.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Cannot release more workers than reserved */
+ nworkers = Min(nworkers, av_nworkers_reserved);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+}
+
+/*
+ * Same as above, but this function releases all the parallel workers that
+ * this autovacuum worker reserved.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3395,6 +3526,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_parallel_workers);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3476,3 +3611,28 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Adjusts the number of free parallel workers to correspond to the new
+ * autovacuum_max_parallel_workers value.
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ int nfree_workers;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..4a5c73a9e33 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only cost-based
+ * delay parameters need to be propagated to parallel vacuum workers, and
+ * we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a5a0edf2534..c2395cf6638 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e686d88afc4..5e1c62d616c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 5bdbf1530a2..29171efbc1b 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..f3783afb51b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,6 +62,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..11dd3aebc6c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,14 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Target number of parallel autovacuum workers. -1 by default disables
+ * parallel vacuum during autovacuum. 0 means choose the parallel degree
+ * based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
[text/x-patch] v25--v26-diff-for-0004.patch (1.3K, 8-v25--v26-diff-for-0004.patch)
From 0326bcc5f0d059ed4b4f1ec3ad48d9ce0d8405aa Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 18:57:04 +0700
Subject: [PATCH 2/2] fixes for 0004 patch
---
src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index e5dacf59980..34f11193dfd 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -109,7 +109,7 @@ $node->safe_psql('postgres', qq{
# Wait until the parallel autovacuum on table is completed. At the same time,
# we check that the required number of parallel workers has been started.
$log_start = $node->wait_for_log(
- qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ qr/autovacuum worker: 2 parallel workers has been released/,
$log_start
);
@@ -214,7 +214,7 @@ $node->safe_psql('postgres', qq{
# Wait until the end of parallel processing
$log_start = $node->wait_for_log(
- qr/parallel workers: index vacuum: 2 planned, 2 reserved, 2 launched/,
+ qr/autovacuum worker: 2 parallel workers has been released/,
$log_start
);
--
2.43.0
[text/x-patch] v25--v26-diff-for-0003.patch (1.1K, 9-v25--v26-diff-for-0003.patch)
From dde5f3806a70befb0d08d2c8afee423b27a02c7d Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 18:56:50 +0700
Subject: [PATCH 1/2] fixes for 0003 patch
---
src/backend/commands/vacuumparallel.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 82618ab3ac5..5105137ce3b 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -527,6 +527,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -1337,6 +1340,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-16 16:46 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-16 16:46 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Mar 16, 2026 at 5:34 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Thu, Mar 12, 2026 at 2:05 AM Masahiko Sawada <[email protected]> wrote:
> >
> > BTW this discussion made me think about changing av_max_parallel_workers to
> > control the number of workers per-autovacuum worker instead (with
> > renaming it to say max_parallel_workers_per_autovacuum_worker). Users
> > can compute the maximum number of parallel workers the system requires
> > by (autovacuum_worker_slots *
> > max_parallel_workers_per_autovacuum_worker). We would no longer need
> > the reservation and release logic. I'd like to hear your opinion.
> >
>
> IIUC, one of autovacuum's main goals is to be "inconspicuous" to the
> rest of the system. I mean that it should not try to vacuum all the tables
> as fast as possible. Instead it should try to interfere with other backends
> as little as possible and try to avoid high resource consumption (assuming
> there is no hazard of wraparound).
>
> I propose to reason from the case in which parallel a/v will actually
> be used:
> We have 3 tables, each of which has 80+ indexes and requires
> parallel a/v. Ideally, each of these tables should be processed with 20
> parallel workers. This is a real example that can be encountered in
> various production systems, where such tables take up about half of all
> the data in the database.
>
> How will parallel a/v handle such a situation?
> 1. Our current implementation
> We can set av_max_parallel_workers to 60 and autovacuum_parallel_workers
> reloption to 20 for each table.
> 2. Proposed idea
> We can set max_parallel_workers_per_av_worker to 20 and
> autovacuum_parallel_workers reloption to 20 for each table.
>
> In both cases we have a guarantee that all tables will be processed with the
> desired number of parallel workers. And both cases allow us to limit the
> CPU consumption by reducing the "av_max_parallel_workers" parameter (for
> the current implementation) or by reducing the "autovacuum_parallel_workers"
> reloption for each table (for the proposed idea). So basically I don't see
> that the current approach has any big advantage over the idea you proposed.
>
> I also asked a friend of mine, who has worked for many years with clients
> running large production systems. He said that it is super important to
> process such huge tables with maximum "intensity", i.e. each a/v worker
> should be able to launch as many parallel workers as required. I guess that
> this is an argument in favor of your idea.
>
> The only argument against this idea that I could come up with is that some
> users may abuse our parallel a/v feature. For instance, the user can set
> "autovacuum_parallel_workers" reloption not only for large tables, but also
> for many smaller ones. In this case the max_parallel_workers_per_av_worker
> must be pretty large (in order to process the huge table). Thus, the user
> can face a situation when all a/v workers are launching additional parallel
> workers => there is high CPU consumption and possibly max_parallel_workers
> shortage. The only way to deal with it is to go through a large number of
> smaller tables and reduce the "autovacuum_parallel_workers" reloption for each
> of them. IMHO, this is a pretty unpleasant experience for the user. On the
> other hand, the user himself is to blame for the occurrence of such a
> situation.
>
> Let's summarize.
> The proposed idea has several strong advantages over the current implementation.
> The only disadvantage I came up with can be avoided by writing recommendations
> on how to use this feature in the documentation. So, if I didn't mess up
> anything and you don't have any doubts, I would rather implement the
> proposed idea.
Thank you for the analysis on the new idea.
While both ideas can achieve our goal of this feature in general, the
new idea doesn't require an additional layer of reserve/release logic
on top of the existing bgworker pool, which is good. I've not tried
coding this idea but I believe the patch can be simplified very much.
So I agree to move to this idea.
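
To make the sizing difference concrete, here is a minimal sketch of the
arithmetic behind the two schemes, using the example above (3 tables, 20
parallel workers each). The helper names are hypothetical and are not part of
any patch in this thread; only the parameter names discussed above are taken
from the thread.

```c
#include <assert.h>

/*
 * Hypothetical helpers, for illustration only; this is not PostgreSQL code.
 *
 * Shared-pool scheme (current implementation): a single global cap
 * (av_max_parallel_workers) is shared by all autovacuum workers, so it has
 * to cover every table that may be vacuumed in parallel at the same time.
 */
static int
shared_pool_cap_needed(int concurrent_tables, int workers_per_table)
{
	assert(concurrent_tables >= 0 && workers_per_table >= 0);
	return concurrent_tables * workers_per_table;
}

/*
 * Per-worker scheme (proposed idea): the cap applies to each autovacuum
 * worker separately, so the worst-case demand on the bgworker pool is
 * autovacuum_worker_slots * max_parallel_workers_per_autovacuum_worker.
 */
static int
per_worker_worst_case(int autovacuum_worker_slots, int per_worker_cap)
{
	assert(autovacuum_worker_slots >= 0 && per_worker_cap >= 0);
	return autovacuum_worker_slots * per_worker_cap;
}
```

With the numbers from the example, the shared cap would have to be 60, while
the per-worker cap would be 20; with 16 autovacuum worker slots the latter
gives a worst-case demand of 320 parallel workers, which is what users would
compare against max_parallel_workers.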
>
> > > 2)
> > > I suggest adding a separate log that will be emitted every time we are
> > > unable to start workers due to a shortage of av_max_parallel_workers.
> >
> > For (2), do you mean that the worker writes these logs regardless of
> > log_autovacuum_min_duration setting? I'm concerned that the server
> > logs would be flooded with these logs especially when multiple
> > autovacuum workers are working very actively and the system is facing
> > a shortage of av_max_parallel_workers.
>
> Oh, I didn't take that into account. But this is not a problem - we can
> accumulate such statistics just as we do now for the "nreserved" ones. And
> then we will log this value with all other stats.
>
> > > Possibly we can introduce a new injection point, or a new log for it.
> > > But I assume that the subject of discussion in patch 0002 is the
> > > "nreserved" logic, and "nlaunched/nplanned" logic does not raise any
> > > questions.
> > >
> > > I suggest splitting the 0002 patch into two parts : 1) basic logic and
> > > 2) additional logic with nreserved or something else. The second part can be
> > > discussed in isolation from the patch set. If we do this, we may not have to
> > > change the tests. What do you think?
> >
> > Assuming the basic logic means nlaunched/nplanned logic, yes, it would
> > be a nice idea. I think user-facing logging stuff can be developed as
> > an improvement independent from the main parallel autovacuum patch.
> > It's ideal if we can implement the main patch (with tests) without
> > relying on the user-facing logging.
>
> OK, actually we can do it.
>
>
>
> Thank you very much for the review!
> Please, see attached patches. The changes are :
> 1) Fixed segfault with accessing outdated pv_shared_cost_params pointer.
> 2) "Logging for autovacuum" is divided into two patches - basic logging
> (nplanned/nlaunched) and advanced logging (nreserved).
> 3) Tests are now independent of logging.
Thank you for updating the patches. I'll wait for the new
implementation and will review the patches as soon as the patches are
updated.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-16 20:54 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-16 20:54 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Mon, Mar 16, 2026 at 11:46 PM Masahiko Sawada <[email protected]> wrote:
>
> While both ideas can achieve our goal of this feature in general, the
> new idea doesn't require an additional layer of reserve/release logic
> on top of the existing bgworker pool, which is good. I've not tried
> coding this idea but I believe the patch can be simplified very much.
> So I agree to move to this idea.
>
OK, let's do it!
Please see the updated set of patches. The main changes are:
0001 patch - removed all logic related to reserving parallel workers.
0002 patch - no changes since v26.
0003 patch - no changes since v26.
0004 patch - removed everything related to the "test_autovacuum" extension.
Also removed the 3rd, 4th and 5th tests, because they were related
only to the worker reservation logic.
0005 patch - minor changes reflecting the new GUC parameter's purpose.
I have maintained the independence of the tests from the user-facing logging.
Instead of the "nworkers released" logs, I have added a single log message at
the end of each round of parallel processing:
"autovacuum worker: finished parallel index processing with N parallel workers".
This is the only code that I added rather than deleted within the 0001 patch.
I hope I didn't miss anything.
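
As an illustration of how the limits are meant to combine under the new
scheme, here is a rough sketch based only on the documentation text in 0005.
It is not the code from 0001. The function name is hypothetical, and the rule
that the leader handles one index itself (so at most nindexes - 1 workers are
useful) is an assumption borrowed from how manual parallel VACUUM is
understood to behave.

```c
#include <assert.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical sketch, not taken from the patch set.
 *
 * reloption               - autovacuum_parallel_workers for the table
 *                           (-1 = disabled, 0 = derive from index count)
 * nindexes                - number of indexes eligible for parallel vacuum
 * av_max_parallel_workers - autovacuum_max_parallel_workers (per a/v worker)
 * max_parallel_workers    - the cluster-wide max_parallel_workers setting
 */
static int
sketch_av_parallel_degree(int reloption, int nindexes,
						  int av_max_parallel_workers,
						  int max_parallel_workers)
{
	int			by_indexes;
	int			requested;

	assert(reloption >= -1 && nindexes >= 0);

	/* -1 (the default) means no parallel index vacuuming for this table. */
	if (reloption < 0)
		return 0;

	/* Assumption: the leader processes one index itself. */
	by_indexes = (nindexes > 0) ? nindexes - 1 : 0;

	/* 0 means "choose the degree based on the number of indexes". */
	requested = (reloption == 0) ? by_indexes
		: SKETCH_MIN(reloption, by_indexes);

	/* Per-autovacuum-worker cap, itself capped by max_parallel_workers. */
	requested = SKETCH_MIN(requested, av_max_parallel_workers);
	requested = SKETCH_MIN(requested, max_parallel_workers);

	return requested;
}
```

The result is only an upper bound on what the leader asks for; as the
documentation in 0005 notes, fewer workers may actually be available at run
time.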
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v27-0005-Documentation-for-parallel-autovacuum.patch (4.4K, 2-v27-0005-Documentation-for-parallel-autovacuum.patch)
From 923f6f3d758edb1f64eadef1f5bb1dfb873f4b21 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 03:23:38 +0700
Subject: [PATCH v27 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8cdd826fbd3..7741796c6b0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9395,6 +9396,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time by a single
+ autovacuum worker. It is capped by <xref linkend="guc-max-parallel-workers"/>.
+ The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..f2a280db569 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The number of parallel workers that can be taken from the pool by a single
+ autovacuum worker is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..4894de021cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v27-0004-Tests-for-parallel-autovacuum.patch (11.3K, 3-v27-0004-Tests-for-parallel-autovacuum.patch)
From 4219b2cf4869c3bab130642fb243441af26906ad Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:50:23 +0700
Subject: [PATCH v27 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 25 +++
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 +++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 169 ++++++++++++++++++
8 files changed, 242 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index cccaee5b620..4f97baced2b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -873,6 +874,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index f4fceb96874..89eaceba55c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -46,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -655,6 +656,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -898,6 +907,15 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * This injection point is used to wait until parallel autovacuum workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -918,6 +936,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ if (AmAutoVacuumWorkerProcess())
+ elog(DEBUG2,
+ ngettext("autovacuum worker: finished parallel index processing with %d parallel worker",
+ "autovacuum worker: finished parallel index processing with %d parallel workers",
+ nworkers),
+ nworkers);
}
/*
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4ac5c84db43..01fe0041c97 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e2b3eef4136..9dcdc68bc87 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..9ad87d48b96
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,169 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ autovacuum_max_parallel_workers = 4
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: finished parallel index processing with 2 parallel workers/,
+ $log_start
+);
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during indexes
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# We were able to get to this point, so everything is fine.
+ok(1);
+
+$node->stop;
+done_testing();
--
2.43.0
[text/x-patch] v27-0002-Logging-for-parallel-autovacuum.patch (8.8K, 4-v27-0002-Logging-for-parallel-autovacuum.patch)
From 5bccf2adb52f57e1ab9ac0616ad2a52a7cf125cd Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 19:01:05 +0700
Subject: [PATCH v27 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 32 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 26 +++++++++++++++++-----
src/include/commands/vacuum.h | 28 +++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 2 ++
4 files changed, 78 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 82c5b28e0ad..cccaee5b620 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -343,6 +343,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +788,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1135,20 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2669,7 +2695,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3103,7 +3130,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index cafa0a4d494..5dea4374ec7 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -521,7 +522,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +535,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -615,10 +618,13 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the statistics it stores will be updated according
+ * to what happens during function execution.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -655,6 +661,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -711,6 +721,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..1d820915d71 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ /* Number of parallel workers we planned to launch */
+ int nplanned;
+
+ /* Number of launched parallel workers */
+ int nlaunched;
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched and
+ * planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 52f8603a7be..a67d54e1819 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
[text/x-patch] v27-0001-Parallel-autovacuum.patch (9.6K, 5-v27-0001-Parallel-autovacuum.patch)
From 041b867e07a61f5163ec35a1fb5fdd6fbe26b431 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v27 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++++++++++
src/backend/commands/vacuumparallel.c | 20 +++++++++++++------
src/backend/postmaster/autovacuum.c | 8 ++++++--
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 ++++++--
src/backend/utils/misc/guc_parameters.dat | 8 ++++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 8 ++++++++
10 files changed, 57 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..cafa0a4d494 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -374,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -554,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -598,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 219673db930..f153d0343c8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2858,8 +2858,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..534e58a398c 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only the cost-based
+ * delay parameters need to reach its parallel autovacuum workers, and that
+ * is handled elsewhere when appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a5a0edf2534..12393c1214b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that a single autovacuum worker can take from bgworkers pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e686d88afc4..5e1c62d616c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 5bdbf1530a2..29171efbc1b 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..11dd3aebc6c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,14 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Target number of parallel autovacuum workers. -1 by default disables
+ * parallel vacuum during autovacuum. 0 means choose the parallel degree
+ * based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
[text/x-patch] v27-0003-Cost-based-parameters-propagation-for-parallel-a.patch (11.0K, 6-v27-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From 9f21c0e081d8a36fd19acee75cfa47dbf74e19e2 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v27 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 21 +++-
src/backend/commands/vacuumparallel.c | 163 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 186 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 5dea4374ec7..f4fceb96874 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -53,6 +60,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +154,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * leader and its parallel autovacuum workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -224,6 +268,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -235,6 +284,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -395,6 +445,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -460,6 +525,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -539,6 +607,95 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
&wusage->cleanup);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For non-autovacuum parallel worker this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1081,6 +1238,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1130,6 +1290,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f153d0343c8..f35acf3d75a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1659,7 +1659,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1d820915d71..cf0c3c9dbf7 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -423,6 +423,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a67d54e1819..15b8c966bf8 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkersUsage
PVWorkersStats
PX_Alias
--
2.43.0
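The generation-counter scheme that patch 0003 describes can be illustrated outside PostgreSQL with a small single-process sketch in C11. All names here (SharedCostParams, WorkerState, shared_init, leader_publish, worker_refresh) are hypothetical and only two of the five parameters are modeled. An atomic_flag spin loop stands in for the slock_t spinlock. This is an illustration of the pattern under those assumptions, not the patch's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared parameter block. */
typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped by the leader after each update */
	atomic_flag lock;			/* stand-in for the spinlock */
	double		cost_delay;
	int			cost_limit;
} SharedCostParams;

/* Hypothetical per-worker local copy of the parameters. */
typedef struct WorkerState
{
	unsigned int local_generation;	/* last generation this worker applied */
	double		cost_delay;
	int			cost_limit;
} WorkerState;

static void
lock_acquire(atomic_flag *lock)
{
	while (atomic_flag_test_and_set(lock))
		;						/* spin until the flag was previously clear */
}

static void
lock_release(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

void
shared_init(SharedCostParams *sp, double cost_delay, int cost_limit)
{
	atomic_init(&sp->generation, 0);
	atomic_flag_clear(&sp->lock);
	sp->cost_delay = cost_delay;
	sp->cost_limit = cost_limit;
}

/* Leader side: publish new values and bump the generation if they changed. */
bool
leader_publish(SharedCostParams *sp, double cost_delay, int cost_limit)
{
	/* Only the leader writes these fields, so it can compare without the lock. */
	if (sp->cost_delay == cost_delay && sp->cost_limit == cost_limit)
		return false;

	lock_acquire(&sp->lock);
	sp->cost_delay = cost_delay;
	sp->cost_limit = cost_limit;
	lock_release(&sp->lock);

	/* Tell workers that they should re-read the shared values. */
	atomic_fetch_add(&sp->generation, 1);
	return true;
}

/* Worker side: cheap generation check, copy under the lock only on change. */
bool
worker_refresh(SharedCostParams *sp, WorkerState *ws)
{
	unsigned int gen = atomic_load(&sp->generation);

	assert(ws->local_generation <= gen);
	if (gen == ws->local_generation)
		return false;

	lock_acquire(&sp->lock);
	ws->cost_delay = sp->cost_delay;
	ws->cost_limit = sp->cost_limit;
	lock_release(&sp->lock);

	ws->local_generation = gen;
	return true;
}
```

In this sketch a worker that calls worker_refresh() at every delay point pays only one atomic read when nothing has changed, which is the reason for keeping the counter outside the lock.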
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-17 16:50 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-17 16:50 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Mar 16, 2026 at 1:54 PM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Mon, Mar 16, 2026 at 11:46 PM Masahiko Sawada <[email protected]> wrote:
> >
> > While both ideas can achieve our goal of this feature in general, the
> > new idea doesn't require an additional layer of reserve/release logic
> > on top of the existing bgworker pool, which is good. I've not tried
> > coding this idea but I believe the patch can be simplified very much.
> > So I agree to move to this idea.
> >
>
> OK, let's do it!
>
> Please, see an updated set of patches. Main changes are :
> 0001 patch - removed all logic related to the parallel workers reserving.
> 0002 patch - no changes regarding v26.
> 0003 patch - no changes regarding v26.
> 0004 patch - removed all stuff related to the "test_autovacuum" extension.
> Also removed the 3rd, 4th and 5th tests, because they were related
> only to the workers reserving logic.
> 0005 patch - minor changes reflecting the new GUC parameter's purpose.
>
> I have maintained the independence of the tests from the user-facing logging.
> Instead of "nworkers released" logs I have added a single log at the end of
> one round of parallel processing :
> "av worker: finished parallel index processing with N parallel workers".
> This is the only code that I added rather than deleted within the 0001 patch.
>
> I hope I didn't miss anything.
Thank you for updating the patch!
I find the current behavior of the autovacuum_parallel_workers storage
parameter somewhat unintuitive for users. The documentation currently
states:
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
It is quite confusing that setting the value to 0 does not actually
disable the parallel vacuum. In many other PostgreSQL parameters, 0
typically means "off" or "no workers." I think that this parameter
should behave as follows:
-1: Use the value of autovacuum_max_parallel_workers (GUC) as the
limit (fallback).
>=0: Use the specified value as the limit, capped by autovacuum_max_parallel_workers. (Specifically, setting this to 0 would disable parallel vacuum for the table).
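To make the proposed semantics concrete, here is a minimal sketch in C of how the per-table limit could be resolved. The helper name resolve_av_parallel_limit is hypothetical and does not appear in the patch. It models only the rule above: -1 falls back to the GUC, and any value >= 0 is capped by the GUC. The number of workers actually requested would still be reduced further by parallel_vacuum_compute_workers() according to the table's indexes.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): resolve the upper limit on
 * parallel workers for one table under the proposed semantics.
 *
 * reloption: autovacuum_parallel_workers storage parameter
 *            (-1 = not set, fall back to the GUC; >= 0 = explicit limit)
 * guc_limit: autovacuum_max_parallel_workers (0 disables the feature)
 */
int
resolve_av_parallel_limit(int reloption, int guc_limit)
{
	assert(reloption >= -1);
	assert(guc_limit >= 0);

	/* Storage parameter not set: the GUC alone decides. */
	if (reloption == -1)
		return guc_limit;

	/* Explicit per-table value, capped by the GUC; 0 disables the table. */
	return (reloption < guc_limit) ? reloption : guc_limit;
}
```

For example, resolve_av_parallel_limit(-1, 0) returns 0, so nothing runs in parallel while the GUC is 0, and resolve_av_parallel_limit(0, 4) returns 0, which is the per-table opt-out.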
Currently, the patch implements parallel autovacuum in an "opt-in"
style. That is, even after setting the GUC to >0, users must manually
set the storage parameter for each table. This assumes that users
already know exactly which tables need parallel vacuum.
However, I believe it would be more intuitive to let the system decide
which tables are eligible for parallel vacuum based on index size and
count (via min_parallel_index_scan_size, etc.), rather than forcing
manual per-table configuration. Therefore, I'm thinking we might want
to make it "opt-out" style by default instead:
- Set the default value of the storage parameter to -1 (i.e., fallback to GUC).
- Set the default value of the GUC autovacuum_max_parallel_workers to 0.
With this configuration:
- Parallel autovacuum is disabled by default.
- Users can enable it globally by simply setting the GUC to >0.
- Users can still disable it for specific tables by setting the
storage parameter to 0.
What do you think?
* 0001 patch
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context
=> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that a single
autovacuum worker can take from bgworkers pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
How about rephrasing the short description to "Maximum number of
parallel processes per autovacuum operation."?
The maximum value should be MAX_PARALLEL_WORKER_LIMIT.
---
* 0002 patch:
I think that it's better to rename PVWorkersStats and PVWorkersUsage
to PVWorkerStats and PVWorkerUsage (making Worker singular).
I've attached the patch for minor fixes including the above comments.
---
* 0004 patch:
+ if (AmAutoVacuumWorkerProcess())
+ elog(DEBUG2,
+ ngettext("autovacuum worker: finished
parallel index processing with %d parallel worker",
+ "autovacuum worker:
finished parallel index processing with %d parallel workers",
+ nworkers),
+ nworkers);
Now that it is straightforward to have the planned and launched worker
counts in the autovacuum logs, let's use these logs in the tests instead
and make it the first patch. We can apply it independently.
---
We check only the server logs throughout the new TAP tests. I think we
should also confirm that the autovacuum completes successfully. I've
attached the proposed change to the TAP tests.
The attached 0003 and 0006 patches are fixup changes on top of v27. The
other patches are unchanged from the previous v27 patch set.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[application/x-patch] v28-0006-fixup-updates-tap-tests.patch (6.6K, 2-v28-0006-fixup-updates-tap-tests.patch)
From 90769faf15b6d830cafb42380b9ec5f5ff974913 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 16 Mar 2026 18:01:45 -0700
Subject: [PATCH v28 6/7] fixup: updates tap tests.
---
src/backend/commands/vacuumparallel.c | 9 +--
.../t/001_parallel_autovacuum.pl | 57 +++++++++++--------
2 files changed, 33 insertions(+), 33 deletions(-)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ef36b9bd286..62b6f50b538 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -656,7 +656,7 @@ parallel_vacuum_update_shared_delay_params(void)
shared_params_generation_local = params_generation;
elog(DEBUG2,
- "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
vacuum_cost_limit,
vacuum_cost_delay,
VacuumCostPageMiss,
@@ -933,13 +933,6 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
-
- if (AmAutoVacuumWorkerProcess())
- elog(DEBUG2,
- ngettext("autovacuum worker: finished parallel index processing with %d parallel worker",
- "autovacuum worker: finished parallel index processing with %d parallel workers",
- nworkers),
- nworkers);
}
/*
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 9ad87d48b96..0147ee33c93 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -11,8 +11,8 @@ if ($ENV{enable_injection_points} ne 'yes')
}
# Before each test we should disable autovacuum for 'test_autovac' table and
-# generate some dead tuples in it.
-
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
sub prepare_for_next_test
{
my ($node, $test_number) = @_;
@@ -21,12 +21,25 @@ sub prepare_for_next_test
ALTER TABLE test_autovac SET (autovacuum_enabled = false);
UPDATE test_autovac SET col_1 = $test_number;
});
+
+ my $count = $node->safe_psql('postgres',
+ qq{SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'});
+
+ return $count;
}
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until('postgres',
+ qq{SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'});
+}
my $psql_out;
-my $node = PostgreSQL::Test::Cluster->new('node1');
+my $node = PostgreSQL::Test::Cluster->new('main');
$node->init;
# Configure postgres, so it can launch parallel autovacuum workers, log all
@@ -54,7 +67,7 @@ $node->safe_psql('postgres', qq{
CREATE EXTENSION injection_points;
});
-my $indexes_num = 4;
+my $indexes_num = 3;
my $initial_rows_num = 10_000;
my $autovacuum_parallel_workers = 2;
@@ -91,7 +104,8 @@ $node->safe_psql('postgres', qq{
# Our table has enough indexes and appropriate reloptions, so autovacuum must
# be able to process it in parallel mode. Just check if it can do it.
-prepare_for_next_test($node, 1);
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
$node->safe_psql('postgres', qq{
ALTER TABLE test_autovac SET (autovacuum_enabled = true);
@@ -99,16 +113,16 @@ $node->safe_psql('postgres', qq{
# Wait until the parallel autovacuum on table is completed. At the same time,
# we check that the required number of parallel workers has been started.
-$log_start = $node->wait_for_log(
- qr/autovacuum worker: finished parallel index processing with 2 parallel workers/,
- $log_start
-);
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
# Test 2:
# Check whether parallel autovacuum leader can propagate cost-based parameters
# to the parallel workers.
-prepare_for_next_test($node, 2);
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
$node->safe_psql('postgres', qq{
SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
@@ -123,8 +137,7 @@ $node->wait_for_event(
'autovacuum-start-parallel-vacuum'
);
-# Reload config - leader worker must update its own parameters during indexes
-# processing
+# Update the shared cost-based delay parameters.
$node->safe_psql('postgres', qq{
ALTER SYSTEM SET vacuum_cost_limit = 500;
ALTER SYSTEM SET vacuum_cost_page_miss = 10;
@@ -133,12 +146,12 @@ $node->safe_psql('postgres', qq{
SELECT pg_reload_conf();
});
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
$node->safe_psql('postgres', qq{
SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
});
-
-# Now wait until parallel autovacuum leader completes processing table (i.e.
-# guaranteed to call vacuum_delay_point) and launches parallel worker.
$node->wait_for_event(
'autovacuum worker',
'autovacuum-leader-before-indexes-processing'
@@ -146,24 +159,18 @@ $node->wait_for_event(
# Check whether parallel worker successfully updated all parameters during
# index processing
-$log_start = $node->wait_for_log(
- qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
- qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
- $log_start
-);
+$node->wait_for_log(qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
# Cleanup
$node->safe_psql('postgres', qq{
- SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
-
SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
- ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
});
-# We were able to get to this point, so everything is fine.
-ok(1);
+wait_for_autovacuum_complete($node, $av_count);
$node->stop;
done_testing();
--
2.53.0
[application/x-patch] v28-0007-Documentation-for-parallel-autovacuum.patch (4.4K, 3-v28-0007-Documentation-for-parallel-autovacuum.patch)
From 40f9605dd52f5079a641456cbb5e79c8a2bed136 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 03:23:38 +0700
Subject: [PATCH v28 7/7] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8cdd826fbd3..7741796c6b0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9395,6 +9396,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time by a single
+ autovacuum worker. Is capped by <xref linkend="guc-max-parallel-workers"/>.
+ The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..f2a280db569 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the enabled
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in a parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The number of parallel workers that can be taken from pool by a single
+ autovacuum worker is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..4894de021cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If value is 0 then parallel degree will computed based on
+ number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.53.0
[application/x-patch] v28-0005-Tests-for-parallel-autovacuum.patch (11.3K, 4-v28-0005-Tests-for-parallel-autovacuum.patch)
From ee17cd8332a67d15e4f5fc0bdb4ec367092d1fc5 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:50:23 +0700
Subject: [PATCH v28 5/7] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 25 +++
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 +++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 169 ++++++++++++++++++
8 files changed, 242 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c57432670e7..8d2980f3ef0 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -873,6 +874,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 98aeb66eec4..ef36b9bd286 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -46,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -653,6 +654,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -895,6 +904,15 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * This injection point is used to wait until parallel autovacuum workers
+ * finishes their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -915,6 +933,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ if (AmAutoVacuumWorkerProcess())
+ elog(DEBUG2,
+ ngettext("autovacuum worker: finished parallel index processing with %d parallel worker",
+ "autovacuum worker: finished parallel index processing with %d parallel workers",
+ nworkers),
+ nworkers);
}
/*
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4ac5c84db43..01fe0041c97 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e2b3eef4136..9dcdc68bc87 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..9ad87d48b96
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,169 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it.
+
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+}
+
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres, so it can launch parallel autovacuum workers, log all
+# information we are interested in and autovacuum works frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ autovacuum_max_parallel_workers = 4
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+prepare_for_next_test($node, 1);
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$log_start = $node->wait_for_log(
+ qr/autovacuum worker: finished parallel index processing with 2 parallel workers/,
+ $log_start
+);
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+prepare_for_next_test($node, 2);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is inited
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Reload config - leader worker must update its own parameters during indexes
+# processing
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Now wait until parallel autovacuum leader completes processing table (i.e.
+# guaranteed to call vacuum_delay_point) and launches parallel worker.
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$log_start = $node->wait_for_log(
+ qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
+ qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_start
+);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
+});
+
+# We were able to get to this point, so everything is fine.
+ok(1);
+
+$node->stop;
+done_testing();
--
2.53.0
[application/x-patch] v28-0003-fixup-minor-changes-for-logging-worker-usage.patch (9.5K, 5-v28-0003-fixup-minor-changes-for-logging-worker-usage.patch)
From 5eb904426a57b88cb7789098bb1bf983dcde1d56 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 16 Mar 2026 15:09:26 -0700
Subject: [PATCH v28 3/7] fixup: minor changes for logging worker usage.
---
src/backend/access/heap/vacuumlazy.c | 35 +++++++++++++--------------
src/backend/commands/vacuumparallel.c | 21 +++++++---------
src/include/commands/vacuum.h | 26 ++++++++++----------
src/tools/pgindent/typedefs.list | 4 +--
4 files changed, 41 insertions(+), 45 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index cccaee5b620..c57432670e7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -346,9 +346,9 @@ typedef struct LVRelState
/*
* Total number of planned and actually launched parallel workers for
- * index scans.
+ * index vacuuming and index cleanup.
*/
- PVWorkersUsage workers_usage;
+ PVWorkerUsage worker_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
@@ -788,10 +788,10 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
- vacrel->workers_usage.vacuum.nlaunched = 0;
- vacrel->workers_usage.vacuum.nplanned = 0;
- vacrel->workers_usage.cleanup.nlaunched = 0;
- vacrel->workers_usage.cleanup.nplanned = 0;
+ vacrel->worker_usage.vacuum.nlaunched = 0;
+ vacrel->worker_usage.vacuum.nplanned = 0;
+ vacrel->worker_usage.cleanup.nlaunched = 0;
+ vacrel->worker_usage.cleanup.nplanned = 0;
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
@@ -1135,20 +1135,19 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
- if (vacrel->workers_usage.vacuum.nplanned > 0)
- {
+
+ if (vacrel->worker_usage.vacuum.nplanned > 0)
appendStringInfo(&buf,
_("parallel workers: index vacuum: %d planned, %d launched in total\n"),
- vacrel->workers_usage.vacuum.nplanned,
- vacrel->workers_usage.vacuum.nlaunched);
- }
- if (vacrel->workers_usage.cleanup.nplanned > 0)
- {
+ vacrel->worker_usage.vacuum.nplanned,
+ vacrel->worker_usage.vacuum.nlaunched);
+
+ if (vacrel->worker_usage.cleanup.nplanned > 0)
appendStringInfo(&buf,
_("parallel workers: index cleanup: %d planned, %d launched\n"),
- vacrel->workers_usage.cleanup.nplanned,
- vacrel->workers_usage.cleanup.nlaunched);
- }
+ vacrel->worker_usage.cleanup.nplanned,
+ vacrel->worker_usage.cleanup.nlaunched);
+
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2696,7 +2695,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
vacrel->num_index_scans,
- &vacrel->workers_usage);
+ &(vacrel->worker_usage.vacuum));
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3131,7 +3130,7 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
estimated_count,
- &vacrel->workers_usage);
+ &(vacrel->worker_usage.cleanup));
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 5dea4374ec7..b7ffd854009 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum, PVWorkersStats *wstats);
+ bool vacuum, PVWorkerStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, PVWorkersUsage *wusage)
+ int num_index_scans, PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -513,8 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
- &wusage->vacuum);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wstats);
}
/*
@@ -523,7 +522,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
int num_index_scans, bool estimated_count,
- PVWorkersUsage *wusage)
+ PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -535,8 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
- &wusage->cleanup);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
/*
@@ -619,12 +617,11 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
*
- * If wstats is not NULL, the statistics it stores will be updated according
- * to what happens during function execution.
+ * If wstats is not NULL, the parallel worker statistics are updated.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum, PVWorkersStats *wstats)
+ bool vacuum, PVWorkerStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -661,7 +658,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
- /* Remember this value, if we asked to */
+ /* Update the statistics, if we asked to */
if (wstats != NULL && nworkers > 0)
wstats->nplanned += nworkers;
@@ -722,7 +719,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
- /* Remember this value, if we asked to */
+ /* Update the statistics, if we asked to */
if (wstats != NULL)
wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1d820915d71..953a506181e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -301,26 +301,26 @@ typedef struct VacDeadItemsInfo
} VacDeadItemsInfo;
/*
- * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ * Statistics for parallel vacuum workers (planned vs. actual)
*/
-typedef struct PVWorkersStats
+typedef struct PVWorkerStats
{
- /* Number of parallel workers we are planned to launch */
+ /* Number of parallel workers planned to launch */
int nplanned;
- /* Number of launched parallel workers */
+ /* Number of parallel workers that were successfully launched */
int nlaunched;
-} PVWorkersStats;
+} PVWorkerStats;
/*
- * PVWorkersUsage stores information about total number of launched and
- * planned workers during parallel vacuum (both for vacuum and cleanup).
+ * PVWorkerUsage stores information about total number of launched and
+ * planned workers during parallel vacuum (both for index vacuum and cleanup).
*/
-typedef struct PVWorkersUsage
+typedef struct PVWorkerUsage
{
- PVWorkersStats vacuum;
- PVWorkersStats cleanup;
-} PVWorkersUsage;
+ PVWorkerStats vacuum;
+ PVWorkerStats cleanup;
+} PVWorkerUsage;
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
@@ -417,12 +417,12 @@ extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- PVWorkersUsage *wusage);
+ PVWorkerStats *wstats);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
bool estimated_count,
- PVWorkersUsage *wusage);
+ PVWorkerStats *wstats);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a67d54e1819..4c230ee38ca 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,8 +2088,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
-PVWorkersUsage
-PVWorkersStats
+PVWorkerUsage
+PVWorkerStats
PX_Alias
PX_Cipher
PX_Combo
--
2.53.0
[application/x-patch] v28-0004-Cost-based-parameters-propagation-for-parallel-a.patch (11.0K, 6-v28-0004-Cost-based-parameters-propagation-for-parallel-a.patch)
From dd3dbd5a43d0b05f3369e003cdb3aa6da9487bb3 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v28 4/7] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 21 +++-
src/backend/commands/vacuumparallel.c | 163 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 186 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index b7ffd854009..98aeb66eec4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -53,6 +60,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +154,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * parallel autovacuum leader and its parallel workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -224,6 +268,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -235,6 +284,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -395,6 +445,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -460,6 +525,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -537,6 +605,95 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For a non-autovacuum parallel worker, this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if the parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameter has changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1078,6 +1235,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1127,6 +1287,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f153d0343c8..f35acf3d75a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1659,7 +1659,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be an explicit VACUUM or ANALYZE, or a parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 953a506181e..cc154737115 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -423,6 +423,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4c230ee38ca..ca99953df27 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.53.0
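
The generation-counter scheme that the 0004 patch above builds around PVSharedCostParams can be sketched in isolation. This is only an illustration under stated assumptions: it uses C11 atomics and a toy spinlock instead of PostgreSQL's pg_atomic_uint32 and slock_t, it shares just two parameters, and every name in it (SharedCostParams, LocalCostParams, shared_init, leader_publish, worker_refresh) is hypothetical and does not exist in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for PVSharedCostParams (two parameters only). */
typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped by the leader after each update */
	atomic_bool locked;			/* toy spinlock, stands in for slock_t */
	double		cost_delay;
	int			cost_limit;
} SharedCostParams;

/* Hypothetical per-worker copy of the parameters. */
typedef struct LocalCostParams
{
	unsigned	generation_seen;	/* last generation this worker applied */
	double		cost_delay;
	int			cost_limit;
} LocalCostParams;

static void
lock_acquire(SharedCostParams *s)
{
	/* Spin until the previous value was "unlocked". */
	while (atomic_exchange(&s->locked, true))
		;
}

static void
lock_release(SharedCostParams *s)
{
	atomic_store(&s->locked, false);
}

static void
shared_init(SharedCostParams *s, double delay, int limit)
{
	assert(limit >= 0);
	atomic_init(&s->generation, 0);
	atomic_init(&s->locked, false);
	s->cost_delay = delay;
	s->cost_limit = limit;
}

/* Leader side: publish new values; returns true if anything changed. */
static bool
leader_publish(SharedCostParams *s, double delay, int limit)
{
	/* Only the leader writes these fields, so it can compare unlocked. */
	if (s->cost_delay == delay && s->cost_limit == limit)
		return false;

	lock_acquire(s);
	s->cost_delay = delay;
	s->cost_limit = limit;
	lock_release(s);

	/* Tell the workers that they should re-read the shared values. */
	atomic_fetch_add(&s->generation, 1);
	return true;
}

/* Worker side: refresh the local copy only when the generation moved. */
static bool
worker_refresh(SharedCostParams *s, LocalCostParams *l)
{
	unsigned	gen = atomic_load(&s->generation);

	if (gen == l->generation_seen)
		return false;			/* cheap path: nothing changed */

	lock_acquire(s);
	l->cost_delay = s->cost_delay;
	l->cost_limit = s->cost_limit;
	lock_release(s);

	l->generation_seen = gen;
	return true;
}
```

The point of the design is that the common case on the worker side costs a single atomic read; the lock is taken only after the leader has actually changed something.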
[application/x-patch] v28-0002-Logging-for-parallel-autovacuum.patch (8.8K, 7-v28-0002-Logging-for-parallel-autovacuum.patch)
From 8ebcfb96ce2a6daad3adcc5f5bcfd0b03e029b88 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 19:01:05 +0700
Subject: [PATCH v28 2/7] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 32 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 26 +++++++++++++++++-----
src/include/commands/vacuum.h | 28 +++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 2 ++
4 files changed, 78 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 82c5b28e0ad..cccaee5b620 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -343,6 +343,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index scans.
+ */
+ PVWorkersUsage workers_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +788,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->workers_usage.vacuum.nlaunched = 0;
+ vacrel->workers_usage.vacuum.nplanned = 0;
+ vacrel->workers_usage.cleanup.nlaunched = 0;
+ vacrel->workers_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1135,20 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage.vacuum.nplanned > 0)
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->workers_usage.vacuum.nplanned,
+ vacrel->workers_usage.vacuum.nlaunched);
+ }
+ if (vacrel->workers_usage.cleanup.nplanned > 0)
+ {
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->workers_usage.cleanup.nplanned,
+ vacrel->workers_usage.cleanup.nlaunched);
+ }
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2669,7 +2695,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3103,7 +3130,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index cafa0a4d494..5dea4374ec7 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
+ &wusage->vacuum);
}
/*
@@ -521,7 +522,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +535,8 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
+ &wusage->cleanup);
}
/*
@@ -615,10 +618,13 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the statistics it stores will be updated according
+ * to what happens during function execution.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -655,6 +661,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Remember this value, if we asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -711,6 +721,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Remember this value, if we asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..1d820915d71 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ */
+typedef struct PVWorkersStats
+{
+ /* Number of parallel workers we are planned to launch */
+ int nplanned;
+
+ /* Number of launched parallel workers */
+ int nlaunched;
+} PVWorkersStats;
+
+/*
+ * PVWorkersUsage stores information about total number of launched and
+ * planned workers during parallel vacuum (both for vacuum and cleanup).
+ */
+typedef struct PVWorkersUsage
+{
+ PVWorkersStats vacuum;
+ PVWorkersStats cleanup;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 52f8603a7be..a67d54e1819 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkersUsage
+PVWorkersStats
PX_Alias
PX_Cipher
PX_Combo
--
2.53.0
[application/x-patch] v28-0001-Parallel-autovacuum.patch (9.6K, 8-v28-0001-Parallel-autovacuum.patch)
From 88e4cdbf82e03032e3bb0c747abea87a00a7ed75 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v28 1/7] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++++++++++
src/backend/commands/vacuumparallel.c | 20 +++++++++++++------
src/backend/postmaster/autovacuum.c | 8 ++++++--
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 ++++++--
src/backend/utils/misc/guc_parameters.dat | 8 ++++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 8 ++++++++
10 files changed, 57 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..9459a010cc3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..cafa0a4d494 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -374,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -554,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -598,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 219673db930..f153d0343c8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2858,8 +2858,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..534e58a398c 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only the cost-based
+ * delay parameters need to be propagated to parallel autovacuum workers,
+ * and we handle that elsewhere.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a5a0edf2534..12393c1214b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that a single autovacuum worker can take from bgworkers pool.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e686d88afc4..5e1c62d616c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 5bdbf1530a2..29171efbc1b 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..11dd3aebc6c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,14 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Target number of parallel autovacuum workers. -1 by default disables
+ * parallel vacuum during autovacuum. 0 means choose the parallel degree
+ * based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.53.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-18 09:23 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-18 09:23 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Mar 17, 2026 at 11:51 PM Masahiko Sawada <[email protected]> wrote:
>
> I find the current behavior of the autovacuum_parallel_workers storage
> parameter somewhat unintuitive for users. The documentation currently
> states:
>
> + <para>
> + Sets the maximum number of parallel autovacuum workers that can process
> + indexes of this table.
> + The default value is -1, which means no parallel index vacuuming for
> + this table. If value is 0 then parallel degree will computed based on
> + number of indexes.
> + Note that the computed number of workers may not actually be available at
> + run time. If this occurs, the autovacuum will run with fewer workers
> + than expected.
> + </para>
>
> It is quite confusing that setting the value to 0 does not actually
> disable the parallel vacuum. In many other PostgreSQL parameters, 0
> typically means "off" or "no workers." I think that this parameter
> should behave as follows:
>
> -1: Use the value of autovacuum_max_parallel_workers (GUC) as the
> limit (fallback).
> >=0: Use the specified value as the limit, capped by autovacuum_max_parallel_workers. (Specifically, setting this to 0 would disable parallel vacuum for the table).
>
Actually, there are several places in the code where "-1" means disabled and
"0" means choosing the parallel degree based on the number of indexes. Since
this is internal logic, I agree that we should make the parameter more
intuitive for the user, although this will make the code a bit confusing.
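To make the translation concrete, here is a minimal sketch of the mapping
between the user-facing storage parameter and the internal convention, as done
inline in table_recheck_autovac() in the attached v29-0002 patch. The helper
name reloption_to_nworkers is hypothetical and exists only for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): translate the user-facing
 * autovacuum_parallel_workers value into the internal nworkers convention.
 *
 *   user-facing  0  -> internal -1 (parallel vacuum disabled)
 *   user-facing -1  -> internal  0 (degree chosen from the number of indexes)
 *   user-facing  N  -> internal  N (explicit request, N > 0)
 */
int
reloption_to_nworkers(int reloption_value)
{
    /* The reloption is declared with a minimum of -1 */
    assert(reloption_value >= -1);

    if (reloption_value > 0)
        return reloption_value;
    if (reloption_value == -1)
        return 0;

    /* reloption_value == 0: the default, parallel vacuum disabled */
    return -1;
}
```

Keeping the inversion in one small place like this is what limits the
confusion: the rest of the code keeps using the existing internal convention.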
> Currently, the patch implements parallel autovacuum as an "opt-in"
> style. That is, even after setting the GUC to >0, users must manually
> set the storage parameter for each table. This assumes that users
> already know exactly which tables need parallel vacuum.
>
> However, I believe it would be more intuitive to let the system decide
> which tables are eligible for parallel vacuum based on index size and
> count (via min_parallel_index_scan_size, etc.), rather than forcing
> manual per-table configuration. Therefore, I'm thinking we might want
> to make it "opt-out" style by default instead:
>
> - Set the default value of the storage parameter to -1 (i.e., fallback to GUC).
> - the default value of the GUC autovacuum_max_parallel_workers at 0.
>
> With this configuration:
>
> - Parallel autovacuum is disabled by default.
> - Users can enable it globally by simply setting the GUC to >0.
> - Users can still disable it for specific tables by setting the
> storage parameter to 0.
>
> What do you think?
I'm afraid that I can't agree with you here. As I wrote above [1], the
parallel a/v feature will be useful when a user has a few huge tables with
a large number of indexes. Only these tables require parallel processing, and
the user knows which ones they are.
If we implement the feature as you suggested, then after setting
autovacuum_max_parallel_workers to N > 0, the user will have to manually
disable parallel processing for all tables except the largest ones. This will
be necessary to ensure that parallel workers are launched specifically to
process the largest tables and are not wasted on small ones.
I.e. I'm proposing a design that requires manual action to *enable*
parallel a/v for several large tables rather than to *disable* it for all the
remaining tables in the cluster. I'm sure that's what users want.
Allowing the system to decide which tables to process in parallel is a good
approach from a design perspective. But consider the following example:
Imagine that we have a threshold above which parallel a/v is used.
Several a/v workers encounter tables that exceed this threshold by 1_000, and
each of these workers decides to launch a few parallel workers. Another a/v
worker encounters a table that exceeds this threshold by 1_000_000 and
tries to launch N parallel workers, but runs into a max_parallel_workers
shortage. Thus, processing of this table will take a very long time to
complete due to a lack of resources. The only way for users to avoid this is
to disable parallel a/v for all tables that exceed the threshold but are not
of particular interest.
I cannot imagine how our heuristics could handle such situations. IMHO it
will come down to users manually disabling parallel a/v for a large number of
tables. I guess that can be pretty frustrating.
What do you think?
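The shortage scenario above can be sketched as a toy model. This is only an
illustration under simplifying assumptions: WorkerPool, pool_acquire and
simulate_shortage are hypothetical names, and the real bgworker pool accounting
is more involved than a single counter.

```c
#include <assert.h>

/*
 * Hypothetical model of a shared pool of parallel workers. pool_free plays
 * the role of the workers still available under max_parallel_workers.
 */
typedef struct WorkerPool
{
    int         pool_free;
} WorkerPool;

/* Grant as many workers as are free, up to the number requested. */
int
pool_acquire(WorkerPool *pool, int nrequested)
{
    int         granted;

    assert(nrequested >= 0);

    granted = (nrequested < pool->pool_free) ? nrequested : pool->pool_free;
    pool->pool_free -= granted;
    return granted;
}

/*
 * Replay the scenario: nsmall autovacuum workers each take per_small parallel
 * workers for tables barely above the threshold, then one more autovacuum
 * worker asks for nbig parallel workers for a huge table. Returns the number
 * of workers the huge table actually gets.
 */
int
simulate_shortage(int pool_size, int nsmall, int per_small, int nbig)
{
    WorkerPool  pool = {pool_size};

    for (int i = 0; i < nsmall; i++)
        (void) pool_acquire(&pool, per_small);

    return pool_acquire(&pool, nbig);
}
```

For instance, simulate_shortage(8, 3, 2, 4) returns 2: with a pool of 8, three
small tables taking 2 workers each leave only 2 of the 4 workers requested for
the huge table.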
>
> +{ name => 'autovacuum_max_parallel_workers', type => 'int', context
> => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> + short_desc => 'Maximum number of parallel workers that a single
> autovacuum worker can take from bgworkers pool.',
> + variable => 'autovacuum_max_parallel_workers',
> + boot_val => '2',
> + min => '0',
> + max => 'MAX_BACKENDS',
> +},
>
> How about rephrasing the short description to "Maximum number of
> parallel processes per autovacuum operation."?
I'm not sure this phrase will be clear to the user. I don't see any place
where we define the "autovacuum operation" concept, so I suppose it could be
ambiguous. What about "Maximum number of parallel processes per autovacuuming
of one table"?
>
> The maximum value should be MAX_PARALLEL_WORKER_LIMIT.
>
Sure!
>
> I think that it's better to rename PVWorkersStats and PVWorkersUsage
> to PVWorkerStats and PVWorkerUsage (making Worker singular).
>
> I've attached the patch for minor fixes including the above comments.
>
I agree with all proposed fixes. Thank you!
>
> + if (AmAutoVacuumWorkerProcess())
> + elog(DEBUG2,
> + ngettext("autovacuum worker: finished
> parallel index processing with %d parallel worker",
> + "autovacuum worker:
> finished parallel index processing with %d parallel workers",
> + nworkers),
> + nworkers);
>
> Now that having planned and launched logs in autovacuum logs is
> straightforward, let's use these logs in the tests instead and make it
> the first patch. We can apply it independently.
>
OK, I agree.
> We check only the server logs throughout the new tap tests. I think we
> should also confirm that the autovacuum successfully completes. I've
> attached the proposed change to the tap tests.
>
I agree with the proposed changes. BTW, don't we need to limit string length
to 80 characters in the tests? Some tests follow this rule and some do not.
--
Thank you very much for the review and proposed patches!
Please see the updated set of patches. Note that "logging for autovacuum"
is now the first patch.
[1] https://www.postgresql.org/message-id/CAJDiXghaazbrQMZZS08d9Ffh2y4w05TgH9dpBhqChv1qNTp%2BxA%40mail.g...
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v29-0005-Documentation-for-parallel-autovacuum.patch (4.5K, 2-v29-0005-Documentation-for-parallel-autovacuum.patch)
From a13a5b269ac51bfba66354123a8be8b0ef5cf64a Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 03:23:38 +0700
Subject: [PATCH v29 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 21 +++++++++++++++++++++
3 files changed, 51 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8cdd826fbd3..7741796c6b0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9395,6 +9396,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time by a single
+ autovacuum worker. This value is capped by <xref linkend="guc-max-parallel-workers"/>.
+ The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..f2a280db569 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The number of parallel workers that a single autovacuum worker can take
+ from the pool is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..e367310a571 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,27 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is 0, which means no parallel index vacuuming for
+ this table. If the value is -1, the parallel degree will be computed based
+ on the number of indexes and limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ parameter.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v29-0003-Cost-based-parameters-propagation-for-parallel-a.patch (11.0K, 3-v29-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From cbe0ea08d700f141a50717283b287457961f3eb3 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v29 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 21 +++-
src/backend/commands/vacuumparallel.c | 163 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 186 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index b7ffd854009..98aeb66eec4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -53,6 +60,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +154,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * leader autovacuum worker and its parallel autovacuum workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -224,6 +268,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -235,6 +284,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -395,6 +445,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -460,6 +525,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -537,6 +605,95 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For a non-autovacuum parallel worker, this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for the autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1078,6 +1235,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1127,6 +1287,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ff57d8fca2a..adccfa06775 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1659,7 +1659,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE, or a parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 953a506181e..cc154737115 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -423,6 +423,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4c230ee38ca..ca99953df27 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.43.0
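For readers following the 0003 patch, the generation-counter handshake can be
sketched outside of PostgreSQL as below. This is a simplified illustration, not
the patch itself: C11 atomics and an atomic_flag spinlock stand in for
pg_atomic_uint32 and slock_t, only two of the five parameters are modelled, and
the names SharedCostParams, LocalCostParams, shared_params_init, leader_publish
and worker_refresh are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for PVSharedCostParams (assumption: reduced to two parameters) */
typedef struct SharedCostParams
{
    atomic_uint generation;     /* bumped by the leader after each update */
    atomic_flag lock;           /* stand-in for slock_t; protects fields below */
    double      cost_delay;
    int         cost_limit;
} SharedCostParams;

/* Per-worker copy, plus the last generation this worker has consumed */
typedef struct LocalCostParams
{
    unsigned int generation_seen;
    double      cost_delay;
    int         cost_limit;
} LocalCostParams;

static void
params_lock(SharedCostParams *s)
{
    while (atomic_flag_test_and_set(&s->lock))
        ;                       /* spin until the flag is released */
}

static void
params_unlock(SharedCostParams *s)
{
    atomic_flag_clear(&s->lock);
}

void
shared_params_init(SharedCostParams *s, double cost_delay, int cost_limit)
{
    atomic_init(&s->generation, 0);
    atomic_flag_clear(&s->lock);
    s->cost_delay = cost_delay;
    s->cost_limit = cost_limit;
}

/* Leader side: publish new values only if something changed. */
bool
leader_publish(SharedCostParams *s, double cost_delay, int cost_limit)
{
    /* Unlocked read is fine here: only the leader ever writes these fields */
    if (cost_delay == s->cost_delay && cost_limit == s->cost_limit)
        return false;

    params_lock(s);
    s->cost_delay = cost_delay;
    s->cost_limit = cost_limit;
    params_unlock(s);

    /* Tell workers that they should re-read the shared values */
    atomic_fetch_add(&s->generation, 1);
    return true;
}

/* Worker side: cheap generation check, copy under lock only when stale. */
bool
worker_refresh(SharedCostParams *s, LocalCostParams *local)
{
    unsigned int gen = atomic_load(&s->generation);

    assert(local->generation_seen <= gen);
    if (gen == local->generation_seen)
        return false;

    params_lock(s);
    local->cost_delay = s->cost_delay;
    local->cost_limit = s->cost_limit;
    params_unlock(s);

    local->generation_seen = gen;
    return true;
}
```

The point of the design is that the common case in vacuum_delay_point() costs a
single atomic read; the lock is taken only after the leader has actually
changed something.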
[text/x-patch] v29-0002-Parallel-autovacuum.patch (10.5K, 4-v29-0002-Parallel-autovacuum.patch)
From 84b220f99866343efb5d5cece5b0392153043f1e Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v29 2/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++++++++++
src/backend/commands/vacuumparallel.c | 20 +++++++++++++------
src/backend/postmaster/autovacuum.c | 14 +++++++++++--
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 ++++++--
src/backend/utils/misc/guc_parameters.dat | 8 ++++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 9 +++++++++
10 files changed, 64 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..055585c38f3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ 0, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..b7ffd854009 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -374,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -555,12 +558,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +607,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 219673db930..ff57d8fca2a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2798,6 +2798,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int multixact_freeze_table_age;
int log_vacuum_min_duration;
int log_analyze_min_duration;
+ int nparallel_workers = -1; /* disabled by default */
/*
* Calculate the vacuum cost parameters and the freeze ages. If there
@@ -2858,8 +2859,16 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether to process the indexes of the table in parallel. */
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers > 0)
+ nparallel_workers = avopts->autovacuum_parallel_workers;
+ else if (avopts->autovacuum_parallel_workers == -1)
+ nparallel_workers = 0;
+ }
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2868,6 +2877,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_vacuum_min_duration = log_vacuum_min_duration;
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ tab->at_params.nworkers = nparallel_workers;
/*
* Later, in vacuum_rel(), we check reloptions for any
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d77502838c4..534e58a398c 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel autovacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a5a0edf2534..9bd155e99f6 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel processes per autovacuuming of one table.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e686d88afc4..5e1c62d616c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -710,6 +710,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 5bdbf1530a2..29171efbc1b 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..1981954008e 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,15 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Target number of parallel autovacuum workers. 0 by default disables
+ * parallel vacuum during autovacuum. -1 means choose the parallel degree
+ * based on the number of indexes (the autovacuum_max_parallel_workers
+ * parameter will be used as a limit).
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
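As a reading aid for the 0002 patch, the worker-count selection in
parallel_vacuum_compute_workers() can be approximated as follows. This is a
simplified sketch under stated assumptions: the function name
cap_parallel_workers is hypothetical, the GUC values are passed as arguments
instead of being read from globals, and the IsUnderPostmaster check and the
per-index eligibility counting are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical, simplified version of the worker-count computation.
 *
 * nrequested uses the internal convention: 0 means "choose based on the
 * number of indexes", N > 0 is an explicit request.
 */
int
cap_parallel_workers(int nrequested, int nindexes_parallel,
                     bool is_autovacuum,
                     int autovacuum_max_parallel_workers,
                     int max_parallel_maintenance_workers)
{
    int         max_workers;
    int         parallel_workers;

    assert(nrequested >= 0);

    /* Autovacuum and manual VACUUM are capped by different GUCs */
    max_workers = is_autovacuum ?
        autovacuum_max_parallel_workers :
        max_parallel_maintenance_workers;

    /* Parallelism disabled, or nothing to process in parallel */
    if (max_workers == 0 || nindexes_parallel <= 0)
        return 0;

    parallel_workers = (nrequested > 0) ?
        Min(nrequested, nindexes_parallel) : nindexes_parallel;

    /* Cap by the applicable GUC */
    return Min(parallel_workers, max_workers);
}
```

With autovacuum_max_parallel_workers at its proposed default of 2, a table with
five eligible indexes and autovacuum_parallel_workers = -1 (internal value 0)
would request 2 workers.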
[text/x-patch] v29-0004-Tests-for-parallel-autovacuum.patch (11.4K, 5-v29-0004-Tests-for-parallel-autovacuum.patch)
From 1a7a9afce5ed6bc6a09f27b8b45dbe9a67b08978 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:50:23 +0700
Subject: [PATCH v29 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 18 ++
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 180 ++++++++++++++++++
8 files changed, 246 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c57432670e7..8d2980f3ef0 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -873,6 +874,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 98aeb66eec4..62b6f50b538 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -46,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -653,6 +654,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -895,6 +904,15 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * This injection point is used to wait until parallel autovacuum workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4ac5c84db43..01fe0041c97 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e2b3eef4136..9dcdc68bc87 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..2f34999d25e
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,180 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql('postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until('postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently.
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ autovacuum_max_parallel_workers = 4
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac (col_1, col_2, col_3, col_4)
+ SELECT
+ g AS col_1,
+ g + 1 AS col_2,
+ g + 2 AS col_3,
+ g + 3 AS col_4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok($node->log_contains(qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-start-parallel-vacuum'
+);
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event(
+ 'autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing'
+);
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$node->wait_for_log(qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
--
2.43.0
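A side note on the log line that Test 2 waits for: the regex expects `cost_delay=2` rather than `2.000000` because the elog() call added in vacuumparallel.c formats vacuum_cost_delay with `%g`, which drops trailing zeros. The standalone sketch below reproduces only the format string from the patch; the helper names (`sketch_format_cost_params`, `sketch_log_contains`) are hypothetical, and the value 2 assumes the default autovacuum cost delay of 2ms is in effect, which the test does not set explicitly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the same message text as the DEBUG2 elog() in
 * the patch. Only the format string is taken from the patch; everything
 * else here is illustrative.
 */
static int
sketch_format_cost_params(char *buf, size_t buflen,
                          int cost_limit, double cost_delay,
                          int page_miss, int page_dirty, int page_hit)
{
    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen,
                    "parallel autovacuum worker updated cost params: "
                    "cost_limit=%d, cost_delay=%g, cost_page_miss=%d, "
                    "cost_page_dirty=%d, cost_page_hit=%d",
                    cost_limit, cost_delay, page_miss, page_dirty, page_hit);
}

/* Hypothetical stand-in for a substring check against the server log. */
static int
sketch_log_contains(const char *line, const char *needle)
{
    return strstr(line, needle) != NULL;
}
```

With a delay of 2.0 the formatted text contains `cost_delay=2,`, and with 0.5 it contains `cost_delay=0.5,`, so a test that changes the delay would need to adjust the expected pattern accordingly.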
[text/x-patch] v29-0001-Logging-for-parallel-autovacuum.patch (8.7K, 6-v29-0001-Logging-for-parallel-autovacuum.patch)
From 7029863373bbb61607ebe5b7070bf2cd70de3091 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 19:01:05 +0700
Subject: [PATCH v29 1/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 31 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 23 ++++++++++++++------
src/include/commands/vacuum.h | 28 ++++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 2 ++
4 files changed, 74 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 82c5b28e0ad..c57432670e7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -343,6 +343,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index vacuuming and index cleanup.
+ */
+ PVWorkerUsage worker_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +788,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->worker_usage.vacuum.nlaunched = 0;
+ vacrel->worker_usage.vacuum.nplanned = 0;
+ vacrel->worker_usage.cleanup.nlaunched = 0;
+ vacrel->worker_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1135,19 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+
+ if (vacrel->worker_usage.vacuum.nplanned > 0)
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->worker_usage.vacuum.nplanned,
+ vacrel->worker_usage.vacuum.nlaunched);
+
+ if (vacrel->worker_usage.cleanup.nplanned > 0)
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->worker_usage.cleanup.nplanned,
+ vacrel->worker_usage.cleanup.nlaunched);
+
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2669,7 +2694,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &(vacrel->worker_usage.vacuum));
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3103,7 +3129,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &(vacrel->worker_usage.cleanup));
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..77834b96a21 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -225,7 +225,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkerStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -499,7 +499,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -510,7 +510,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wstats);
}
/*
@@ -518,7 +518,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -530,7 +531,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
/*
@@ -607,10 +608,12 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the parallel worker statistics are updated.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkerStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -647,6 +650,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Update the statistics, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -703,6 +710,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Update the statistics, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..953a506181e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Statistics for parallel vacuum workers (planned vs. actual)
+ */
+typedef struct PVWorkerStats
+{
+ /* Number of parallel workers planned to launch */
+ int nplanned;
+
+ /* Number of parallel workers that were successfully launched */
+ int nlaunched;
+} PVWorkerStats;
+
+/*
+ * PVWorkerUsage stores information about total number of launched and
+ * planned workers during parallel vacuum (both for index vacuum and cleanup).
+ */
+typedef struct PVWorkerUsage
+{
+ PVWorkerStats vacuum;
+ PVWorkerStats cleanup;
+} PVWorkerUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkerStats *wstats);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkerStats *wstats);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 52f8603a7be..4c230ee38ca 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,6 +2088,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkerUsage
+PVWorkerStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
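To illustrate why the index vacuum log line says "in total": the counters in PVWorkerStats are accumulated with `+=` on every call to parallel_vacuum_process_all_indexes(), so a vacuum that needs several index scans reports the sum over all rounds. The sketch below is a standalone approximation, not the patch code. The struct layouts are copied from the vacuum.h hunk, while `sketch_account_round` is a hypothetical helper, and since the surrounding control flow is not fully visible in the hunks, it simply adds the launched count whenever workers were planned.

```c
#include <assert.h>
#include <stddef.h>

/* Layouts as added to vacuum.h by the patch. */
typedef struct PVWorkerStats
{
    int nplanned;   /* workers planned to launch */
    int nlaunched;  /* workers successfully launched */
} PVWorkerStats;

typedef struct PVWorkerUsage
{
    PVWorkerStats vacuum;
    PVWorkerStats cleanup;
} PVWorkerUsage;

/*
 * Hypothetical helper: account for one round of parallel index processing.
 * wstats may be NULL, in which case nothing is recorded.
 */
static void
sketch_account_round(PVWorkerStats *wstats, int nworkers, int nworkers_launched)
{
    assert(nworkers >= 0);
    assert(nworkers_launched >= 0 && nworkers_launched <= nworkers);

    if (wstats == NULL || nworkers == 0)
        return;

    wstats->nplanned += nworkers;
    wstats->nlaunched += nworkers_launched;
}
```

For example, two index vacuum rounds that each plan 2 workers, where the second round gets only 1 worker, accumulate to 4 planned and 3 launched in `usage.vacuum`, while `usage.cleanup` is tracked separately.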
[text/x-patch] v28--v29-diff-for-0002.patch (3.9K, 7-v28--v29-diff-for-0002.patch)
From fae3dd1f6bd97acb626a85120bb09e2b18b76f98 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 18 Mar 2026 15:03:31 +0700
Subject: [PATCH 4/9] fixup for parallel autovacuum core
---
src/backend/access/common/reloptions.c | 2 +-
src/backend/postmaster/autovacuum.c | 12 +++++++++---
src/backend/utils/misc/guc_parameters.dat | 4 ++--
src/include/utils/rel.h | 7 ++++---
4 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9459a010cc3..055585c38f3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -242,7 +242,7 @@ static relopt_int intRelOpts[] =
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
- -1, -1, 1024
+ 0, -1, 1024
},
{
{
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f153d0343c8..ff57d8fca2a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2798,6 +2798,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int multixact_freeze_table_age;
int log_vacuum_min_duration;
int log_analyze_min_duration;
+ int nparallel_workers = -1; /* disabled by default */
/*
* Calculate the vacuum cost parameters and the freeze ages. If there
@@ -2860,9 +2861,13 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
/* Decide whether we need to process indexes of table in parallel. */
- tab->at_params.nworkers = avopts
- ? avopts->autovacuum_parallel_workers
- : -1;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers > 0)
+ nparallel_workers = avopts->autovacuum_parallel_workers;
+ else if (avopts->autovacuum_parallel_workers == -1)
+ nparallel_workers = 0;
+ }
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
@@ -2872,6 +2877,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_vacuum_min_duration = log_vacuum_min_duration;
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ tab->at_params.nworkers = nparallel_workers;
/*
* Later, in vacuum_rel(), we check reloptions for any
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 12393c1214b..9bd155e99f6 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -155,11 +155,11 @@
},
{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
- short_desc => 'Maximum number of parallel workers that a single autovacuum worker can take from bgworkers pool.',
+ short_desc => 'Maximum number of parallel processes per autovacuuming of one table.',
variable => 'autovacuum_max_parallel_workers',
boot_val => '2',
min => '0',
- max => 'MAX_BACKENDS',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
},
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 11dd3aebc6c..1981954008e 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -313,9 +313,10 @@ typedef struct AutoVacOpts
bool enabled;
/*
- * Target number of parallel autovacuum workers. -1 by default disables
- * parallel vacuum during autovacuum. 0 means choose the parallel degree
- * based on the number of indexes.
+ * Target number of parallel autovacuum workers. 0 by default disables
+ * parallel vacuum during autovacuum. -1 means choose the parallel degree
+ * based on the number of indexes (the autovacuum_max_parallel_workers
+ * parameter will be used as a limit).
*/
int autovacuum_parallel_workers;
--
2.43.0
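The sign conventions in this fixup are easy to mix up, because the reloption and VacuumParams.nworkers use opposite encodings: for the reloption, 0 disables parallel vacuum and -1 asks for a computed degree, whereas for nworkers, -1 means disabled and 0 means "choose based on the number of indexes". A minimal standalone sketch of the mapping done in table_recheck_autovac(), as I read the hunk above, follows. `SketchAutoVacOpts` and `sketch_nworkers_from_reloption` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant field of AutoVacOpts. */
typedef struct SketchAutoVacOpts
{
    int autovacuum_parallel_workers;    /* reloption: 0 (default), -1, or > 0 */
} SketchAutoVacOpts;

/*
 * Translate the reloption into the value stored in at_params.nworkers:
 *   -1   parallel vacuum is disabled
 *    0   the parallel degree is computed from the indexes
 *   > 0  the requested number of workers
 * A NULL avopts (no reloptions set) leaves parallel vacuum disabled.
 */
static int
sketch_nworkers_from_reloption(const SketchAutoVacOpts *avopts)
{
    int nparallel_workers = -1;     /* disabled by default */

    if (avopts != NULL)
    {
        /* The reloption's minimum value is -1. */
        assert(avopts->autovacuum_parallel_workers >= -1);

        if (avopts->autovacuum_parallel_workers > 0)
            nparallel_workers = avopts->autovacuum_parallel_workers;
        else if (avopts->autovacuum_parallel_workers == -1)
            nparallel_workers = 0;
    }

    return nparallel_workers;
}
```

So a table without reloptions, or with the default of 0, yields -1 (no parallel index processing), a reloption of -1 yields 0, and a positive value is passed through unchanged.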
[text/x-patch] v28--v29-diff-for-0005.patch (1.3K, 8-v28--v29-diff-for-0005.patch)
From 7de90ab01a711a54510c1f41934555ed2fb4c9e4 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 18 Mar 2026 15:41:55 +0700
Subject: [PATCH 9/9] documentation fixes
---
doc/src/sgml/ref/create_table.sgml | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 4894de021cd..e367310a571 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1728,9 +1728,10 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<para>
Sets the maximum number of parallel autovacuum workers that can process
indexes of this table.
- The default value is -1, which means no parallel index vacuuming for
- this table. If value is 0 then parallel degree will computed based on
- number of indexes.
+ The default value is 0, which means no parallel index vacuuming for
+ this table. If the value is -1, the parallel degree is computed based on the
+ number of indexes and limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ parameter.
Note that the computed number of workers may not actually be available at
run time. If this occurs, the autovacuum will run with fewer workers
than expected.
--
2.43.0
[text/x-patch] v28--v29-diff-for-0001.patch (9.4K, 9-v28--v29-diff-for-0001.patch)
From 1571b7a2eb7dacbd80c697aab937e398c0f70daf Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 16 Mar 2026 15:09:26 -0700
Subject: [PATCH 2/9] fixup for logging.
---
src/backend/access/heap/vacuumlazy.c | 35 +++++++++++++--------------
src/backend/commands/vacuumparallel.c | 21 +++++++---------
src/include/commands/vacuum.h | 26 ++++++++++----------
src/tools/pgindent/typedefs.list | 4 +--
4 files changed, 41 insertions(+), 45 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index cccaee5b620..c57432670e7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -346,9 +346,9 @@ typedef struct LVRelState
/*
* Total number of planned and actually launched parallel workers for
- * index scans.
+ * index vacuuming and index cleanup.
*/
- PVWorkersUsage workers_usage;
+ PVWorkerUsage worker_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
@@ -788,10 +788,10 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
- vacrel->workers_usage.vacuum.nlaunched = 0;
- vacrel->workers_usage.vacuum.nplanned = 0;
- vacrel->workers_usage.cleanup.nlaunched = 0;
- vacrel->workers_usage.cleanup.nplanned = 0;
+ vacrel->worker_usage.vacuum.nlaunched = 0;
+ vacrel->worker_usage.vacuum.nplanned = 0;
+ vacrel->worker_usage.cleanup.nlaunched = 0;
+ vacrel->worker_usage.cleanup.nplanned = 0;
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
@@ -1135,20 +1135,19 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
- if (vacrel->workers_usage.vacuum.nplanned > 0)
- {
+
+ if (vacrel->worker_usage.vacuum.nplanned > 0)
appendStringInfo(&buf,
_("parallel workers: index vacuum: %d planned, %d launched in total\n"),
- vacrel->workers_usage.vacuum.nplanned,
- vacrel->workers_usage.vacuum.nlaunched);
- }
- if (vacrel->workers_usage.cleanup.nplanned > 0)
- {
+ vacrel->worker_usage.vacuum.nplanned,
+ vacrel->worker_usage.vacuum.nlaunched);
+
+ if (vacrel->worker_usage.cleanup.nplanned > 0)
appendStringInfo(&buf,
_("parallel workers: index cleanup: %d planned, %d launched\n"),
- vacrel->workers_usage.cleanup.nplanned,
- vacrel->workers_usage.cleanup.nlaunched);
- }
+ vacrel->worker_usage.cleanup.nplanned,
+ vacrel->worker_usage.cleanup.nlaunched);
+
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2696,7 +2695,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
vacrel->num_index_scans,
- &vacrel->workers_usage);
+ &(vacrel->worker_usage.vacuum));
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3131,7 +3130,7 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
estimated_count,
- &vacrel->workers_usage);
+ &(vacrel->worker_usage.cleanup));
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 692729efd5e..77834b96a21 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -225,7 +225,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum, PVWorkersStats *wstats);
+ bool vacuum, PVWorkerStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -499,7 +499,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, PVWorkersUsage *wusage)
+ int num_index_scans, PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -510,8 +510,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true,
- &wusage->vacuum);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wstats);
}
/*
@@ -520,7 +519,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
int num_index_scans, bool estimated_count,
- PVWorkersUsage *wusage)
+ PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -532,8 +531,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false,
- &wusage->cleanup);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
/*
@@ -611,12 +609,11 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
*
- * If wstats is not NULL, the statistics it stores will be updated according
- * to what happens during function execution.
+ * If wstats is not NULL, the parallel worker statistics are updated.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum, PVWorkersStats *wstats)
+ bool vacuum, PVWorkerStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -653,7 +650,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
- /* Remember this value, if we asked to */
+ /* Update the statistics, if we were asked to */
if (wstats != NULL && nworkers > 0)
wstats->nplanned += nworkers;
@@ -714,7 +711,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
- /* Remember this value, if we asked to */
+ /* Update the statistics, if we were asked to */
if (wstats != NULL)
wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1d820915d71..953a506181e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -301,26 +301,26 @@ typedef struct VacDeadItemsInfo
} VacDeadItemsInfo;
/*
- * Helper for the PVWorkersUsage structure (see below), to avoid repetition.
+ * Statistics for parallel vacuum workers (planned vs. actual)
*/
-typedef struct PVWorkersStats
+typedef struct PVWorkerStats
{
- /* Number of parallel workers we are planned to launch */
+ /* Number of parallel workers planned to launch */
int nplanned;
- /* Number of launched parallel workers */
+ /* Number of parallel workers that were successfully launched */
int nlaunched;
-} PVWorkersStats;
+} PVWorkerStats;
/*
- * PVWorkersUsage stores information about total number of launched and
- * planned workers during parallel vacuum (both for vacuum and cleanup).
+ * PVWorkerUsage stores information about total number of launched and
+ * planned workers during parallel vacuum (both for index vacuum and cleanup).
*/
-typedef struct PVWorkersUsage
+typedef struct PVWorkerUsage
{
- PVWorkersStats vacuum;
- PVWorkersStats cleanup;
-} PVWorkersUsage;
+ PVWorkerStats vacuum;
+ PVWorkerStats cleanup;
+} PVWorkerUsage;
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
@@ -417,12 +417,12 @@ extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- PVWorkersUsage *wusage);
+ PVWorkerStats *wstats);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
bool estimated_count,
- PVWorkersUsage *wusage);
+ PVWorkerStats *wstats);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a67d54e1819..4c230ee38ca 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2088,8 +2088,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
-PVWorkersUsage
-PVWorkersStats
+PVWorkerUsage
+PVWorkerStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
[text/x-patch] v28--v29-diff-for-0004.patch (6.6K, 10-v28--v29-diff-for-0004.patch)
From 94988438f0530b5804ff6515af16dba8d9bd3118 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 16 Mar 2026 18:01:45 -0700
Subject: [PATCH 7/9] fixup: updates tap tests.
---
src/backend/commands/vacuumparallel.c | 9 +--
.../t/001_parallel_autovacuum.pl | 63 +++++++++++--------
2 files changed, 38 insertions(+), 34 deletions(-)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ef36b9bd286..62b6f50b538 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -656,7 +656,7 @@ parallel_vacuum_update_shared_delay_params(void)
shared_params_generation_local = params_generation;
elog(DEBUG2,
- "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
vacuum_cost_limit,
vacuum_cost_delay,
VacuumCostPageMiss,
@@ -933,13 +933,6 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
-
- if (AmAutoVacuumWorkerProcess())
- elog(DEBUG2,
- ngettext("autovacuum worker: finished parallel index processing with %d parallel worker",
- "autovacuum worker: finished parallel index processing with %d parallel workers",
- nworkers),
- nworkers);
}
/*
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 9ad87d48b96..2f34999d25e 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -11,8 +11,8 @@ if ($ENV{enable_injection_points} ne 'yes')
}
# Before each test we should disable autovacuum for 'test_autovac' table and
-# generate some dead tuples in it.
-
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
sub prepare_for_next_test
{
my ($node, $test_number) = @_;
@@ -21,12 +21,27 @@ sub prepare_for_next_test
ALTER TABLE test_autovac SET (autovacuum_enabled = false);
UPDATE test_autovac SET col_1 = $test_number;
});
+
+ my $count = $node->safe_psql('postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
}
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until('postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
my $psql_out;
-my $node = PostgreSQL::Test::Cluster->new('node1');
+my $node = PostgreSQL::Test::Cluster->new('main');
$node->init;
# Configure postgres, so it can launch parallel autovacuum workers, log all
@@ -54,7 +69,7 @@ $node->safe_psql('postgres', qq{
CREATE EXTENSION injection_points;
});
-my $indexes_num = 4;
+my $indexes_num = 3;
my $initial_rows_num = 10_000;
my $autovacuum_parallel_workers = 2;
@@ -91,7 +106,8 @@ $node->safe_psql('postgres', qq{
# Our table has enough indexes and appropriate reloptions, so autovacuum must
# be able to process it in parallel mode. Just check if it can do it.
-prepare_for_next_test($node, 1);
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
$node->safe_psql('postgres', qq{
ALTER TABLE test_autovac SET (autovacuum_enabled = true);
@@ -99,16 +115,16 @@ $node->safe_psql('postgres', qq{
# Wait until the parallel autovacuum on table is completed. At the same time,
# we check that the required number of parallel workers has been started.
-$log_start = $node->wait_for_log(
- qr/autovacuum worker: finished parallel index processing with 2 parallel workers/,
- $log_start
-);
+wait_for_autovacuum_complete($node, $av_count);
+ok($node->log_contains(qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
# Test 2:
# Check whether parallel autovacuum leader can propagate cost-based parameters
# to the parallel workers.
-prepare_for_next_test($node, 2);
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
$node->safe_psql('postgres', qq{
SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
@@ -123,8 +139,7 @@ $node->wait_for_event(
'autovacuum-start-parallel-vacuum'
);
-# Reload config - leader worker must update its own parameters during indexes
-# processing
+# Update the shared cost-based delay parameters.
$node->safe_psql('postgres', qq{
ALTER SYSTEM SET vacuum_cost_limit = 500;
ALTER SYSTEM SET vacuum_cost_page_miss = 10;
@@ -133,12 +148,12 @@ $node->safe_psql('postgres', qq{
SELECT pg_reload_conf();
});
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
$node->safe_psql('postgres', qq{
SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
});
-
-# Now wait until parallel autovacuum leader completes processing table (i.e.
-# guaranteed to call vacuum_delay_point) and launches parallel worker.
$node->wait_for_event(
'autovacuum worker',
'autovacuum-leader-before-indexes-processing'
@@ -146,24 +161,20 @@ $node->wait_for_event(
# Check whether parallel worker successfully updated all parameters during
# index processing
-$log_start = $node->wait_for_log(
- qr/parallel autovacuum worker cost params: cost_limit=500, cost_delay=2, / .
- qr/cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
- $log_start
-);
+$node->wait_for_log(qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
-# Cleanup
$node->safe_psql('postgres', qq{
SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+# Cleanup
+$node->safe_psql('postgres', qq{
SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
-
- ALTER TABLE test_autovac SET (autovacuum_parallel_workers = $autovacuum_parallel_workers);
});
-# We were able to get to this point, so everything is fine.
-ok(1);
-
$node->stop;
done_testing();
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-18 19:49 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-18 19:49 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Mar 18, 2026 at 2:23 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Tue, Mar 17, 2026 at 11:51 PM Masahiko Sawada <[email protected]> wrote:
> >
> > I find the current behavior of the autovacuum_parallel_workers storage
> > parameter somewhat unintuitive for users. The documentation currently
> > states:
> >
> > + <para>
> > + Sets the maximum number of parallel autovacuum workers that can process
> > + indexes of this table.
> > + The default value is -1, which means no parallel index vacuuming for
> > + this table. If value is 0 then parallel degree will computed based on
> > + number of indexes.
> > + Note that the computed number of workers may not actually be available at
> > + run time. If this occurs, the autovacuum will run with fewer workers
> > + than expected.
> > + </para>
> >
> > It is quite confusing that setting the value to 0 does not actually
> > disable the parallel vacuum. In many other PostgreSQL parameters, 0
> > typically means "off" or "no workers." I think that this parameter
> > should behave as follows:
> >
> > -1: Use the value of autovacuum_max_parallel_workers (GUC) as the
> > limit (fallback).
> > >=0: Use the specified value as the limit, capped by autovacuum_max_parallel_workers. (Specifically, setting this to 0 would disable parallel vacuum for the table).
> >
>
> Actually we have several places in the code where "-1" means disabled and "0"
> means choosing a parallel degree based on the number of indexes. Since this
> is an inner logic, I agree that we should make our parameter more intuitive
> to the user. But this will make the code a bit confusing.
Yes, we already have such code for the PARALLEL option of the VACUUM command:
/*
* Disable parallel vacuum, if user has specified parallel degree
* as zero.
*/
if (nworkers == 0)
params.nworkers = -1;
else
params.nworkers = nworkers;
I guess it's better for the autovacuum code to follow this code as
well, for consistency.
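To make the proposed semantics concrete, here is a minimal sketch. It is
not taken from the patch: the function names are hypothetical, and it only
assumes the behavior described above (-1 falls back to the GUC, >=0 is
capped by the GUC, and 0 disables parallel vacuum for the table).

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the patch): resolve the per-table limit on
 * parallel autovacuum workers under the proposed semantics.
 *
 *   reloption == -1 : fall back to autovacuum_max_parallel_workers
 *   reloption >=  0 : use the reloption, capped by the GUC
 *
 * A result of 0 means "no parallel vacuum for this table".
 */
int
resolve_av_parallel_limit(int reloption, int guc_max)
{
	assert(reloption >= -1);
	assert(guc_max >= 0);

	if (reloption == -1)
		return guc_max;

	return (reloption < guc_max) ? reloption : guc_max;
}

/*
 * Translate the user-facing limit into the internal convention, where -1
 * means "disabled". This mirrors the VACUUM (PARALLEL) snippet above.
 */
int
to_internal_nworkers(int limit)
{
	return (limit == 0) ? -1 : limit;
}
```

With the GUC at 2, a reloption of -1 resolves to 2, a reloption of 8 is
capped to 2, and a reloption of 0 maps to the internal "disabled" value.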
>
> > Currently, the patch implements parallel autovacuum as an "opt-in"
> > style. That is, even after setting the GUC to >0, users must manually
> > set the storage parameter for each table. This assumes that users
> > already know exactly which tables need parallel vacuum.
> >
> > However, I believe it would be more intuitive to let the system decide
> > which tables are eligible for parallel vacuum based on index size and
> > count (via min_parallel_index_scan_size, etc.), rather than forcing
> > manual per-table configuration. Therefore, I'm thinking we might want
> > to make it "opt-out" style by default instead:
> >
> > - Set the default value of the storage parameter to -1 (i.e., fallback to GUC).
> > - Set the default value of the GUC autovacuum_max_parallel_workers to 0.
> >
> > With this configuration:
> >
> > - Parallel autovacuum is disabled by default.
> > - Users can enable it globally by simply setting the GUC to >0.
> > - Users can still disable it for specific tables by setting the
> > storage parameter to 0.
> >
> > What do you think?
>
> I'm afraid that I can't agree with you here. As I wrote above [1], the
> parallel a/v feature will be useful when a user has a few huge tables with
> a big amount of indexes. Only these tables require parallel processing and a
> user knows about it.
Isn't it a case where users need to increase
min_parallel_index_scan_size? Suppose that there are two tables that
are big enough and have enough indexes, it's more natural to me to use
parallel vacuum for both tables without user manual settings.
> If we implement the feature as you suggested, then after setting the
> av_max_parallel_workers to N > 0, the user will have to manually disable
> processing for all tables except the largest ones. This will need to be done
> to ensure that parallel workers are launched specifically to process the
> largest tables and not wasting on the processing of little ones.
>
> I.e. I'm proposing a design that will require manual actions to *enable*
> parallel a/v for several large tables rather than *disable* it for all of
> the rest tables in the cluster. I'm sure that's what users want.
>
> Allowing the system to decide which tables to process in parallel is a good
> way from a design perspective. But I'm thinking of the following example :
> Imagine that we have a threshold, when exceeded, parallel a/v is used.
> Several a/v workers encounter tables which exceed this threshold by 1_000 and
> each of these workers decides to launch a few parallel workers. Another a/v
> worker encounters a table which is beyond this threshold by 1_000_000 and
> tries to launch N parallel workers, but facing the max_parallel_workers
> shortage. Thus, processing of this table will take a very long time to
> complete due to lack of resources. The only way for users to avoid it is to
> disable parallel a/v for all tables, which exceeds the threshold and are not
> of particular interest.
I think the same thing happens even with the current design as long as
users misconfigure max_parallel_workers, no? Setting
autovacuum_max_parallel_workers to >0 would mean that users want to
give additional resources for autovacuums in general, I think it makes
sense to use parallel vacuum even for tables which exceed the
threshold by 1000.
Users who want to use parallel autovacuum would have to set
max_parallel_workers (and max_worker_processes) high enough so that
each autovacuum worker can use parallel workers. If resource
contention occurs, it's a sign that the limits are not configured
properly.
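As a rough illustration of that sizing rule, the check could be sketched
as below. The function names are hypothetical, and the sketch assumes the
worst case where every autovacuum worker requests its full per-leader
limit at the same moment. It ignores parallel queries and other
background workers that draw from the same pool, so it is a lower bound
rather than a recommendation.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Worst-case number of parallel workers autovacuum could request at once,
 * assuming every autovacuum worker acts as a leader and asks for the full
 * per-leader limit simultaneously.
 */
int
av_worst_case_parallel_demand(int autovacuum_max_workers,
							  int autovacuum_max_parallel_workers)
{
	assert(autovacuum_max_workers >= 0);
	assert(autovacuum_max_parallel_workers >= 0);

	return autovacuum_max_workers * autovacuum_max_parallel_workers;
}

/*
 * True if the pool can satisfy that worst case. Parallel workers are taken
 * from the pool sized by max_worker_processes and further limited by
 * max_parallel_workers, so the smaller of the two is the effective limit.
 */
bool
av_parallel_limits_sufficient(int autovacuum_max_workers,
							  int autovacuum_max_parallel_workers,
							  int max_parallel_workers,
							  int max_worker_processes)
{
	int			pool = (max_parallel_workers < max_worker_processes) ?
		max_parallel_workers : max_worker_processes;

	return av_worst_case_parallel_demand(autovacuum_max_workers,
										 autovacuum_max_parallel_workers) <= pool;
}
```

For example, 3 autovacuum workers with a per-leader limit of 2 need 6
parallel workers in the worst case, which fits in a pool of 8 but not in
a pool of 4.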
> >
> > +{ name => 'autovacuum_max_parallel_workers', type => 'int', context
> > => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> > + short_desc => 'Maximum number of parallel workers that a single
> > autovacuum worker can take from bgworkers pool.',
> > + variable => 'autovacuum_max_parallel_workers',
> > + boot_val => '2',
> > + min => '0',
> > + max => 'MAX_BACKENDS',
> > +},
> >
> > How about rephrasing the short description to "Maximum number of
> > parallel processes per autovacuum operation."?
>
> I'm not sure if this phrase will be understandable to the user.
> I don't see any places where we would define the "autovacuum operation"
> concept, so I suppose it could be ambiguous. What about "Maximum number of
> parallel processes per autovacuuming of one table"?
"autovacuuming of one table" sounds unnatural to me. How about
"Maximum number of parallel workers that can be used by a single
autovacuum worker."?
>
> > We check only the server logs throughout the new tap tests. I think we
> > should also confirm that the autovacuum successfully completes. I've
> > attached the proposed change to the tap tests.
> >
>
> I agree with proposed changes. BTW, don't we need to reduce the strings
> length to 80 characters in the tests? In some tests, this rule is followed,
> and in some it is not.
Yeah, pgperltidy should be run for new tests.
> Thank you very much for the review and proposed patches!
> Please, see an updated set of patches. Note that the "logging for autovacuum"
> is considered as the first patch now.
Thank you for updating the patches!
The 0001 patch looks good to me. I've updated the commit message and
attached it. I'm going to push the patch, barring any objections.
While we need more discussion on the above points (opt-in vs.
opt-out), I think that the rest of the patches are getting close.
Regarding the documentation changes, I find that the current patch
needs more explanation at appropriate sections. I think we need to:
1. describe the new autovacuum_max_parallel_workers GUC parameter (in
config.sgml)
2. describe the new autovacuum_parallel_workers storage parameter (in
create_table.sgml)
3. mention that autovacuum could use parallel vacuum (in maintenance.sgml).
I think that part 1 should include the basic explanation of the GUC
parameter as well as how the number of workers is decided (which could
be similar to the description for PARALLEL options of the VACUUM
command). Part 2 can explain the storage parameter as follows:
Per-table value for <xref linkend="guc-autovacuum-max-parallel-workers"/>
parameter. If -1 is specified,
<varname>autovacuum_max_parallel_workers</varname>
value will be used. The default value is 0.
Part 3 can briefly mention that autovacuum can perform parallel vacuum
with parallel workers capped by autovacuum_max_parallel_workers as
follows:
For tables with the <xref linkend="reloption-autovacuum-parallel-workers"/>
storage parameter set, an autovacuum worker can perform index vacuuming and
index cleanup with background workers. The number of workers launched by
a single autovacuum worker is limited by the
<xref linkend="guc-autovacuum-max-parallel-workers"/>.
What do you think?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-patch] v30-0001-Add-parallel-vacuum-worker-usage-to-VACUUM-VERBO.patch (9.5K, 2-v30-0001-Add-parallel-vacuum-worker-usage-to-VACUUM-VERBO.patch)
From 31592aa698726b0ec3a72d5c2bac59f5ef9f2806 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 19:01:05 +0700
Subject: [PATCH v30] Add parallel vacuum worker usage to VACUUM (VERBOSE) and
autovacuum logs.
This commit adds both the number of parallel workers planned and the
number of parallel workers actually launched to the output of
VACUUM (VERBOSE) and autovacuum logs.
Previously, this information was only reported as an INFO message
during VACUUM (VERBOSE), which meant it was not included in autovacuum
logs in practice. Although autovacuum does not yet support parallel
vacuum, a subsequent patch will enable it and utilize these logs in
its regression tests. This change also improves observability by
making it easier to verify if parallel vacuum is utilizing the
expected number of workers.
Author: Daniil Davydov <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Sami Imseih <[email protected]>
Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com
---
src/backend/access/heap/vacuumlazy.c | 31 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 23 ++++++++++++++------
src/include/commands/vacuum.h | 28 ++++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 2 ++
4 files changed, 74 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 82c5b28e0ad..c57432670e7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -343,6 +343,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index vacuuming and index cleanup.
+ */
+ PVWorkerUsage worker_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +788,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->worker_usage.vacuum.nlaunched = 0;
+ vacrel->worker_usage.vacuum.nplanned = 0;
+ vacrel->worker_usage.cleanup.nlaunched = 0;
+ vacrel->worker_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1135,19 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+
+ if (vacrel->worker_usage.vacuum.nplanned > 0)
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->worker_usage.vacuum.nplanned,
+ vacrel->worker_usage.vacuum.nlaunched);
+
+ if (vacrel->worker_usage.cleanup.nplanned > 0)
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->worker_usage.cleanup.nplanned,
+ vacrel->worker_usage.cleanup.nlaunched);
+
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2669,7 +2694,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &(vacrel->worker_usage.vacuum));
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3103,7 +3129,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &(vacrel->worker_usage.cleanup));
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..77834b96a21 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -225,7 +225,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkerStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -499,7 +499,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -510,7 +510,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wstats);
}
/*
@@ -518,7 +518,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -530,7 +531,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
/*
@@ -607,10 +608,12 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the parallel worker statistics are updated.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkerStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -647,6 +650,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Update the statistics, if we asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -703,6 +710,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Update the statistics, if we asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..953a506181e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Statistics for parallel vacuum workers (planned vs. actual)
+ */
+typedef struct PVWorkerStats
+{
+ /* Number of parallel workers planned to launch */
+ int nplanned;
+
+ /* Number of parallel workers that were successfully launched */
+ int nlaunched;
+} PVWorkerStats;
+
+/*
+ * PVWorkerUsage stores information about total number of launched and
+ * planned workers during parallel vacuum (both for index vacuum and cleanup).
+ */
+typedef struct PVWorkerUsage
+{
+ PVWorkerStats vacuum;
+ PVWorkerStats cleanup;
+} PVWorkerUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkerStats *wstats);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkerStats *wstats);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 174e2798443..a847d37b526 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2090,6 +2090,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkerUsage
+PVWorkerStats
PX_Alias
PX_Cipher
PX_Combo
--
2.53.0
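The accounting added by the v30-0001 patch above can be sketched as a
standalone fragment. The PVWorkerStats struct and the log format string
are copied from the patch; record_round() and format_index_vacuum_usage()
are hypothetical stand-ins for the updates done inside
parallel_vacuum_process_all_indexes() and the reporting done in
heap_vacuum_rel(). The point is only that the counters accumulate across
index scans, so the log line reports totals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copied from the patch: planned vs. actually launched workers */
typedef struct PVWorkerStats
{
	int			nplanned;
	int			nlaunched;
} PVWorkerStats;

/*
 * Hypothetical stand-in for one round of parallel index processing.
 * As in the patch, a NULL wstats means the caller does not want statistics.
 */
void
record_round(PVWorkerStats *wstats, int nplanned, int nlaunched)
{
	assert(nlaunched >= 0 && nlaunched <= nplanned);

	if (wstats == NULL)
		return;

	if (nplanned > 0)
		wstats->nplanned += nplanned;
	wstats->nlaunched += nlaunched;
}

/*
 * Hypothetical formatter using the message text from the patch. Writes
 * nothing if no workers were planned, matching the nplanned > 0 check.
 * Returns the number of characters that snprintf() reports.
 */
int
format_index_vacuum_usage(char *buf, size_t len, const PVWorkerStats *wstats)
{
	if (wstats->nplanned <= 0)
	{
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}

	return snprintf(buf, len,
					"parallel workers: index vacuum: %d planned, %d launched in total\n",
					wstats->nplanned, wstats->nlaunched);
}
```

Two index scans that each plan 2 workers, with only 1 launched in the
second scan, accumulate to 4 planned and 3 launched.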
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-19 14:28 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-19 14:28 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Thu, Mar 19, 2026 at 2:49 AM Masahiko Sawada <[email protected]> wrote:
>
> Yes, we already have such a code for PARALLEL option for the VACUUM command.
>
> I guess it's better that autovacuum codes also somewhat follow this
> code for better consistency.
>
I agree. You can find it in the v29-0002 patch.
> > I'm afraid that I can't agree with you here. As I wrote above [1], the
> > parallel a/v feature will be useful when a user has a few huge tables with
> > a big amount of indexes. Only these tables require parallel processing and a
> > user knows about it.
>
> Isn't it a case where users need to increase
> min_parallel_index_scan_size? Suppose that there are two tables that
> are big enough and have enough indexes, it's more natural to me to use
> parallel vacuum for both tables without user manual settings.
>
Do you mean that the user can increase this parameter so that smaller tables
are not considered for parallel a/v? If so, I don't think it will always
be handy. When I say "smaller tables" I mean that they are small relative to
the really huge tables. But these "smaller tables" can still be pretty big and
need parallel index scans within parallel queries or VACUUM PARALLEL (i.e. not
autovacuum). Increasing min_parallel_index_scan_size can hurt the
performance of queries that rely on the ability to scan the indexes
of such tables in parallel. A separate parameter such as
"autovacuum_min_parallel_index_scan_size" could help here, but I don't think
that we want to introduce many new GUC parameters for a single feature.
> > If we implement the feature as you suggested, then after setting the
> > av_max_parallel_workers to N > 0, the user will have to manually disable
> > processing for all tables except the largest ones. This will need to be done
> > to ensure that parallel workers are launched specifically to process the
> > largest tables and not wasting on the processing of little ones.
> >
> > I.e. I'm proposing a design that will require manual actions to *enable*
> > parallel a/v for several large tables rather than *disable* it for all of
> > the rest tables in the cluster. I'm sure that's what users want.
> >
> > Allowing the system to decide which tables to process in parallel is a good
> > way from a design perspective. But I'm thinking of the following example :
> > Imagine that we have a threshold, when exceeded, parallel a/v is used.
> > Several a/v workers encounter tables which exceed this threshold by 1_000 and
> > each of these workers decides to launch a few parallel workers. Another a/v
> > worker encounters a table which is beyond this threshold by 1_000_000 and
> > tries to launch N parallel workers, but facing the max_parallel_workers
> > shortage. Thus, processing of this table will take a very long time to
> > complete due to lack of resources. The only way for users to avoid it is to
> > disable parallel a/v for all tables, which exceeds the threshold and are not
> > of particular interest.
>
> I think the same thing happens even with the current design as long as
> users misconfigure max_parallel_workers, no? Setting
> autovacuum_max_parallel_workers to >0 would mean that users want to
> give additional resources for autovacuums in general, I think it makes
> sense to use parallel vacuum even for tables which exceed the
> threshold by 1000.
>
> Users who want to use parallel autovacuum would have to set
> max_parallel_workers (and max_worker_processes) high enough so that
> each autovacuum worker can use parallel workers. If resource
> contention occurs, it's a sign that the limits are not configured
> properly.
>
Yeah, currently a user can misconfigure max_parallel_workers, so that (for
example) multiple VACUUM PARALLEL operations running at the same time will face
a shortage of parallel workers. But I guess that every system has some sane
limit for this parameter's value. If we want to guarantee that all a/v leaders
can launch as many parallel workers as they require, we might need
to increase max_parallel_workers too much (and cross that sane limit).
IMHO this may be unacceptable for many production systems, because it would
undermine stability.
I don't have direct evidence of my words, so I'll try to get the opinion of
the people who will use the parallel a/v feature in big productions.
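
To make the shortage scenario above concrete, here is a toy model of a shared
worker pool. It is only a sketch: WorkerPool, pool_launch() and pool_release()
are hypothetical names that do not exist in PostgreSQL, and the real accounting
behind max_parallel_workers is more involved. The only property modelled is
that a leader gets Min(requested, free) workers, so leaders that arrive first
can leave a later, larger request short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the pool limited by max_parallel_workers. */
typedef struct WorkerPool
{
	int			free_slots;		/* workers not currently in use */
} WorkerPool;

/*
 * Try to launch nrequested workers. Returns how many were actually
 * launched, which may be fewer than requested (possibly zero).
 */
static int
pool_launch(WorkerPool *pool, int nrequested)
{
	int			nlaunched;

	assert(pool != NULL && nrequested >= 0);

	nlaunched = (nrequested < pool->free_slots) ?
		nrequested : pool->free_slots;
	pool->free_slots -= nlaunched;

	return nlaunched;
}

/* Give workers back to the pool once a leader has finished with them. */
static void
pool_release(WorkerPool *pool, int nworkers)
{
	assert(pool != NULL && nworkers >= 0);
	pool->free_slots += nworkers;
}
```

With a pool of 8 workers, three leaders that take 2 workers each leave only 2
for a fourth leader asking for 6, no matter how large its table is.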
> > I'm not sure if this phrase will be understandable to the user.
> > I don't see any places where we would define the "autovacuum operation"
> > concept, so I suppose it could be ambiguous. What about "Maximum number of
> > parallel processes per autovacuuming of one table"?
>
> "autovacuuming of one table" sounds unnatural to me. How about
> "Maximum number of parallel workers that can be used by a single
> autovacuum worker."?
>
It sounds good, I agree.
> >
> > > We check only the server logs throughout the new tap tests. I think we
> > > should also confirm that the autovacuum successfully completes. I've
> > > attached the proposed change to the tap tests.
> > >
> >
> > I agree with proposed changes. BTW, don't we need to reduce the strings
> > length to 80 characters in the tests? In some tests, this rule is followed,
> > and in some it is not.
>
> Yeah, pgperltidy should be run for new tests.
>
OK. I'll do it.
> The 0001 patch looks good to me. I've updated the commit message and
> attached it. I'm going to push the patch, barring any objections.
>
Great news!
> Regarding the documentation changes, I find that the current patch
> needs more explanation at appropriate sections. I think we need to:
>
> 1. describe the new autovacuum_max_parallel_workers GUC parameter (in
> config.sgml)
> 2. describe the new autovacuum_parallel_workers storage parameter (in
> create_table.sgml)
> 3. mention that autovacuum could use parallel vacuum (in maintenance.sgml).
>
I agree.
> I think that part 1 should include the basic explanation of the GUC
> parameter as well as how the number of workers is decided (which could
> be similar to the description for PARALLEL options of the VACUUM
> command).
IMHO, the description of the method for determining the number of parallel
workers would fit better in part 3.

BTW, do we need to mention that this parameter can be overridden by the
per-table setting?
> Part 2 can explain the storage parameter as follows:
>
> Per-table value for <xref linkend="guc-autovacuum-max-parallel-workers"/>
> parameter. If -1 is specified,
> <varname>autovacuum_max_parallel_workers</varname>
> value will be used. The default value is 0.
>
It looks very compact and beautiful, I agree.

Actually, if -1 is specified then we are "choosing the parallel degree based
on the number of indexes". We have several places in the code with such
phrasing. I don't really like it because 1) even if the value != -1 we still
take the number of indexes into account and 2) it is basically the same as
saying "limited by the GUC parameter". I don't want to touch the existing
comments in vacuumparallel.c, but in our patch I'd like to say that the "GUC
parameter's value will be used". I hope this will not cause any
misunderstanding among readers.
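
Points 1) and 2) can be sketched as follows. This is a simplified model, not
the patch's code: av_requested_workers() and av_compute_workers() are
hypothetical names. The first mirrors the reloption handling added to
table_recheck_autovac() in the 0002 patch, the second mirrors the capping at
the end of parallel_vacuum_compute_workers(). nindexes_parallel is assumed to
be the number of indexes eligible for parallel processing, which the real
function works out itself.

```c
#include <assert.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))

/*
 * Map the autovacuum_parallel_workers reloption and the
 * autovacuum_max_parallel_workers GUC to the requested worker count.
 * Returns -1 when parallel index processing is disabled for the table.
 */
static int
av_requested_workers(int reloption, int guc_max)
{
	assert(reloption >= -1 && guc_max >= 0);

	if (reloption > 0)
		return reloption;		/* explicit per-table value */
	if (reloption == -1 && guc_max > 0)
		return guc_max;			/* "GUC parameter's value will be used" */
	return -1;					/* reloption is 0, or the GUC is 0 */
}

/*
 * Decide how many workers to ask for. The number of indexes is taken into
 * account whether or not the request came from an explicit reloption, and
 * the result is always capped by the GUC.
 */
static int
av_compute_workers(int nrequested, int nindexes_parallel, int guc_max)
{
	int			workers;

	if (nrequested < 0 || guc_max == 0 || nindexes_parallel <= 0)
		return 0;

	workers = (nrequested > 0) ?
		Min(nrequested, nindexes_parallel) : nindexes_parallel;

	return Min(workers, guc_max);
}
```

For example, a reloption of 8 on a table with 5 eligible indexes and a GUC
value of 2 ends up requesting 2 workers, and a reloption of 2 on a table with
a single eligible index requests 1.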
> Part 3 can briefly mention that autovacuum can perform parallel vacuum
> with parallel workers capped by autovacuum_max_parallel_workers as
> follows:
>
> For tables with the <xref linkend="reloption-autovacuum-parallel-workers"/>
> storage parameter set, an autovacuum worker can perform index vacuuming and
> index cleanup with background workers. The number of workers launched by
> a single autovacuum worker is limited by the
> <xref linkend="guc-autovacuum-max-parallel-workers"/>.
I suggest also adding here a description of the method for calculating the
number of parallel workers. If we do, I feel that this part of the
documentation will be almost identical to the one for VACUUM (PARALLEL) (except
for a few small details).

Maybe we can create a dedicated subsection in "Routine Vacuuming" where we
describe how the number of parallel workers is decided. Let's call it
something like "24.1.7 Parallel Vacuuming". Both VACUUM (PARALLEL) and parallel
autovacuum can refer to this subsection. I think it will be much easier to
maintain. What do you think?
--
Thank you very much for the comments and the prepared patch!
Please see the updated set of patches (I didn't touch patches 0001, 0003 and
0005).
The 0001 patch contains a pretty controversial fix for the
"autovacuum_parallel_workers" description, but I couldn't come up with anything
better.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v30-0003-Cost-based-parameters-propagation-for-parallel-a.patch (11.0K, 2-v30-0003-Cost-based-parameters-propagation-for-parallel-a.patch)
From 31812fa9b922bad041ceb90a6ff6e0814a5e1f77 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 15 Jan 2026 23:15:48 +0700
Subject: [PATCH v30 3/5] Cost based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 21 +++-
src/backend/commands/vacuumparallel.c | 163 ++++++++++++++++++++++++++
src/backend/postmaster/autovacuum.c | 2 +-
src/include/commands/vacuum.h | 2 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 186 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index b7ffd854009..98aeb66eec4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -18,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -53,6 +60,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -122,6 +154,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between the
+ * leader autovacuum worker and its parallel vacuum workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -224,6 +268,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -235,6 +284,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -395,6 +445,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -460,6 +525,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -537,6 +605,95 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For a non-autovacuum parallel worker, this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if the parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -1078,6 +1235,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1127,6 +1287,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e810e1303db..f0535a0997f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1659,7 +1659,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 953a506181e..cc154737115 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -423,6 +423,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a4a2ed07816..d5c7b91e167 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2090,6 +2090,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.43.0
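
The generation-counter scheme used by the 0003 patch above can be tried out in
isolation. The following is a minimal standalone sketch, not the patch's code:
it uses C11 atomics and an atomic_flag spinlock in place of pg_atomic_uint32
and slock_t, carries only two of the five parameters, and all names
(SharedCostParams, leader_publish(), worker_refresh() and so on) are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Reduced parameter set; the patch shares five values. */
typedef struct CostParams
{
	double		cost_delay;
	int			cost_limit;
} CostParams;

typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped by the leader after each update */
	atomic_flag lock;			/* protects params */
	CostParams	params;
} SharedCostParams;

static void
shared_lock(SharedCostParams *s)
{
	while (atomic_flag_test_and_set(&s->lock))
		;						/* spin until the flag is released */
}

static void
shared_unlock(SharedCostParams *s)
{
	atomic_flag_clear(&s->lock);
}

static void
shared_init(SharedCostParams *s, CostParams initial)
{
	atomic_init(&s->generation, 0);
	atomic_flag_clear(&s->lock);
	s->params = initial;
}

/* Leader side: publish new values only if something actually changed. */
static bool
leader_publish(SharedCostParams *s, CostParams current)
{
	/* Unlocked read is fine here: only the leader ever writes params. */
	if (current.cost_delay == s->params.cost_delay &&
		current.cost_limit == s->params.cost_limit)
		return false;

	shared_lock(s);
	s->params = current;
	shared_unlock(s);

	/* Tell workers that they should re-read the shared parameters. */
	atomic_fetch_add(&s->generation, 1);
	return true;
}

/* Worker side: refresh the local copy when the generation has moved. */
static bool
worker_refresh(SharedCostParams *s, unsigned *local_generation,
			   CostParams *local)
{
	unsigned	gen = atomic_load(&s->generation);

	assert(*local_generation <= gen);
	if (gen == *local_generation)
		return false;			/* nothing changed in the leader */

	shared_lock(s);
	*local = s->params;
	shared_unlock(s);

	*local_generation = gen;
	return true;
}
```

Because the worker reads the generation before copying the parameters, a
concurrent update can at worst make it copy the newest values while recording
an older generation. In that case it simply refreshes once more on its next
check.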
[text/x-patch] v30-0002-Parallel-autovacuum.patch (10.4K, 3-v30-0002-Parallel-autovacuum.patch)
From b3409be9b386a2ddd4778c34bd71eca34bf48332 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v30 2/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++++++++++
src/backend/commands/vacuumparallel.c | 20 +++++++++++++------
src/backend/postmaster/autovacuum.c | 18 +++++++++++++++--
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 ++++++--
src/backend/utils/misc/guc_parameters.dat | 8 ++++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 ++
10 files changed, 61 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 237ab8d0ed9..03e6fae930e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -235,6 +235,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Overrides value of the autovacuum_max_parallel_workers parameter for this table, if > -1.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ 0, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1968,6 +1977,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..b7ffd854009 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -374,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -555,12 +558,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +607,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 219673db930..e810e1303db 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2798,6 +2798,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int multixact_freeze_table_age;
int log_vacuum_min_duration;
int log_analyze_min_duration;
+ int nparallel_workers = -1; /* disabled by default */
/*
* Calculate the vacuum cost parameters and the freeze ages. If there
@@ -2858,8 +2859,20 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers > 0)
+ nparallel_workers = avopts->autovacuum_parallel_workers;
+ else if (avopts->autovacuum_parallel_workers == -1)
+ {
+ nparallel_workers = autovacuum_max_parallel_workers > 0
+ ? autovacuum_max_parallel_workers
+ : -1; /* disable parallelism if parameter's value is 0 */
+ }
+ }
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2868,6 +2881,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_vacuum_min_duration = log_vacuum_min_duration;
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ tab->at_params.nworkers = nparallel_workers;
/*
* Later, in vacuum_rel(), we check reloptions for any
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..45b39b7c47f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only the
+ * cost-based delay parameters need to be propagated to parallel
+ * autovacuum workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0c9854ad8fc..3d2fd35a004 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e4abe6c0077..11d96f4dd4f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 5bdbf1530a2..29171efbc1b 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
[text/x-patch] v30-0005-Documentation-for-parallel-autovacuum.patch (4.5K, 4-v30-0005-Documentation-for-parallel-autovacuum.patch)
From dc0181fd585ae46c55f209ea6f56cbdff40b35de Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 03:23:38 +0700
Subject: [PATCH v30 5/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 21 +++++++++++++++++++++
3 files changed, 51 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8cdd826fbd3..7741796c6b0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9395,6 +9396,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time by a single
+ autovacuum worker. This value is capped by <xref linkend="guc-max-parallel-workers"/>.
+ The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..f2a280db569 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers in order to vacuum the indexes of
+ this table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The number of parallel workers that a single autovacuum worker can take
+ from the pool is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 982532fe725..e367310a571 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1718,6 +1718,27 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is 0, which means no parallel index vacuuming for
+ this table. If the value is -1, the parallel degree will be computed based on
+ the number of indexes and limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ parameter.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
[text/x-patch] v30-0001-Add-parallel-vacuum-worker-usage-to-VACUUM-VERBO.patch (9.5K, 5-v30-0001-Add-parallel-vacuum-worker-usage-to-VACUUM-VERBO.patch)
From 9941505b9dedb447de37940793e68999c06e7be7 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 16 Mar 2026 19:01:05 +0700
Subject: [PATCH v30 1/5] Add parallel vacuum worker usage to VACUUM (VERBOSE)
and autovacuum logs.
This commit adds both the number of parallel workers planned and the
number of parallel workers actually launched to the output of
VACUUM (VERBOSE) and autovacuum logs.
Previously, this information was only reported as an INFO message
during VACUUM (VERBOSE), which meant it was not included in autovacuum
logs in practice. Although autovacuum does not yet support parallel
vacuum, a subsequent patch will enable it and utilize these logs in
its regression tests. This change also improves observability by
making it easier to verify if parallel vacuum is utilizing the
expected number of workers.
Author: Daniil Davydov <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Sami Imseih <[email protected]>
Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com
---
src/backend/access/heap/vacuumlazy.c | 31 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 23 ++++++++++++++------
src/include/commands/vacuum.h | 28 ++++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 2 ++
4 files changed, 74 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 82c5b28e0ad..c57432670e7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -343,6 +343,13 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Total number of planned and actually launched parallel workers for
+ * index vacuuming and index cleanup.
+ */
+ PVWorkerUsage worker_usage;
+
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -781,6 +788,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->new_all_visible_all_frozen_pages = 0;
vacrel->new_all_frozen_pages = 0;
+ vacrel->worker_usage.vacuum.nlaunched = 0;
+ vacrel->worker_usage.vacuum.nplanned = 0;
+ vacrel->worker_usage.cleanup.nlaunched = 0;
+ vacrel->worker_usage.cleanup.nplanned = 0;
+
/*
* Get cutoffs that determine which deleted tuples are considered DEAD,
* not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
@@ -1123,6 +1135,19 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+
+ if (vacrel->worker_usage.vacuum.nplanned > 0)
+ appendStringInfo(&buf,
+ _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
+ vacrel->worker_usage.vacuum.nplanned,
+ vacrel->worker_usage.vacuum.nlaunched);
+
+ if (vacrel->worker_usage.cleanup.nplanned > 0)
+ appendStringInfo(&buf,
+ _("parallel workers: index cleanup: %d planned, %d launched\n"),
+ vacrel->worker_usage.cleanup.nplanned,
+ vacrel->worker_usage.cleanup.nlaunched);
+
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2669,7 +2694,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ &(vacrel->worker_usage.vacuum));
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3103,7 +3129,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ &(vacrel->worker_usage.cleanup));
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 279108ca89f..77834b96a21 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -225,7 +225,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkerStats *wstats);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -499,7 +499,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -510,7 +510,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wstats);
}
/*
@@ -518,7 +518,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkerStats *wstats)
{
Assert(!IsParallelWorker());
@@ -530,7 +531,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
/*
@@ -607,10 +608,12 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
/*
* Perform index vacuum or index cleanup with parallel workers. This function
* must be used by the parallel vacuum leader process.
+ *
+ * If wstats is not NULL, the parallel worker statistics are updated.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkerStats *wstats)
{
int nworkers;
PVIndVacStatus new_status;
@@ -647,6 +650,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /* Update the statistics, if we were asked to */
+ if (wstats != NULL && nworkers > 0)
+ wstats->nplanned += nworkers;
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -703,6 +710,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Enable shared cost balance for leader backend */
VacuumSharedCostBalance = &(pvs->shared->cost_balance);
VacuumActiveNWorkers = &(pvs->shared->active_nworkers);
+
+ /* Update the statistics, if we were asked to */
+ if (wstats != NULL)
+ wstats->nlaunched += pvs->pcxt->nworkers_launched;
}
if (vacuum)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..953a506181e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,28 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * Statistics for parallel vacuum workers (planned vs. actual)
+ */
+typedef struct PVWorkerStats
+{
+ /* Number of parallel workers planned to launch */
+ int nplanned;
+
+ /* Number of parallel workers that were successfully launched */
+ int nlaunched;
+} PVWorkerStats;
+
+/*
+ * PVWorkerUsage stores information about total number of launched and
+ * planned workers during parallel vacuum (both for index vacuum and cleanup).
+ */
+typedef struct PVWorkerUsage
+{
+ PVWorkerStats vacuum;
+ PVWorkerStats cleanup;
+} PVWorkerUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +416,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkerStats *wstats);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkerStats *wstats);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4673eca9cd6..a4a2ed07816 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2090,6 +2090,8 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVWorkerUsage
+PVWorkerStats
PX_Alias
PX_Cipher
PX_Combo
--
2.43.0
[text/x-patch] v30-0004-Tests-for-parallel-autovacuum.patch (11.4K, 6-v30-0004-Tests-for-parallel-autovacuum.patch)
From 8becbcbb62c89537955de3edb5e7a568c244b631 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:50:23 +0700
Subject: [PATCH v30 4/5] Tests for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuumparallel.c | 18 ++
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 191 ++++++++++++++++++
8 files changed, 257 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c57432670e7..8d2980f3ef0 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -873,6 +874,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 98aeb66eec4..62b6f50b538 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -46,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -653,6 +654,14 @@ parallel_vacuum_update_shared_delay_params(void)
VacuumUpdateCosts();
shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
}
/*
@@ -895,6 +904,15 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * This injection point is used to wait until parallel autovacuum workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..336a212faf4 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..929659956cb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..0364019d5f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,191 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Configure postgres, so it can launch parallel autovacuum workers, log all
+# information we are interested in and autovacuum works frequently
+$node->append_conf(
+ 'postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ autovacuum_max_parallel_workers = 4
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is inited
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
--
2.43.0
[text/x-patch] v29--v30-diff-for-0004.patch (5.7K, 7-v29--v30-diff-for-0004.patch)
From 2490e0f492096d33f56eb8d0a2a3da35434dfa1a Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 19 Mar 2026 21:25:40 +0700
Subject: [PATCH] fixes for 0004
---
.../t/001_parallel_autovacuum.pl | 61 +++++++++++--------
1 file changed, 36 insertions(+), 25 deletions(-)
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 2f34999d25e..0364019d5f0 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -17,12 +17,14 @@ sub prepare_for_next_test
{
my ($node, $test_number) = @_;
- $node->safe_psql('postgres', qq{
+ $node->safe_psql(
+ 'postgres', qq{
ALTER TABLE test_autovac SET (autovacuum_enabled = false);
UPDATE test_autovac SET col_1 = $test_number;
});
- my $count = $node->safe_psql('postgres', qq{
+ my $count = $node->safe_psql(
+ 'postgres', qq{
SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
});
@@ -34,7 +36,8 @@ sub wait_for_autovacuum_complete
{
my ($node, $old_count) = @_;
- $node->poll_query_until('postgres', qq{
+ $node->poll_query_until(
+ 'postgres', qq{
SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
});
}
@@ -46,7 +49,8 @@ $node->init;
# Configure postgres, so it can launch parallel autovacuum workers, log all
# information we are interested in and autovacuum works frequently
-$node->append_conf('postgresql.conf', qq{
+$node->append_conf(
+ 'postgresql.conf', qq{
max_worker_processes = 20
max_parallel_workers = 20
autovacuum_max_parallel_workers = 4
@@ -65,7 +69,8 @@ if (!$node->check_extension('injection_points'))
}
# Create all functions needed for testing
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
CREATE EXTENSION injection_points;
});
@@ -74,7 +79,8 @@ my $initial_rows_num = 10_000;
my $autovacuum_parallel_workers = 2;
# Create table and fill it with some data
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
CREATE TABLE test_autovac (
id SERIAL PRIMARY KEY,
col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
@@ -91,7 +97,8 @@ $node->safe_psql('postgres', qq{
});
# Create specified number of b-tree indexes on the table
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
DO \$\$
DECLARE
i INTEGER;
@@ -109,15 +116,17 @@ $node->safe_psql('postgres', qq{
my $av_count = prepare_for_next_test($node, 1);
my $log_offset = -s $node->logfile;
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
ALTER TABLE test_autovac SET (autovacuum_enabled = true);
});
# Wait until the parallel autovacuum on table is completed. At the same time,
# we check that the required number of parallel workers has been started.
wait_for_autovacuum_complete($node, $av_count);
-ok($node->log_contains(qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
- $log_offset));
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
# Test 2:
# Check whether parallel autovacuum leader can propagate cost-based parameters
@@ -126,7 +135,8 @@ ok($node->log_contains(qr/parallel workers: index vacuum: 2 planned, 2 launched
$av_count = prepare_for_next_test($node, 2);
$log_offset = -s $node->logfile;
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
@@ -134,13 +144,12 @@ $node->safe_psql('postgres', qq{
});
# Wait until parallel autovacuum is inited
-$node->wait_for_event(
- 'autovacuum worker',
- 'autovacuum-start-parallel-vacuum'
-);
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
# Update the shared cost-based delay parameters.
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
ALTER SYSTEM SET vacuum_cost_limit = 500;
ALTER SYSTEM SET vacuum_cost_page_miss = 10;
ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
@@ -151,27 +160,29 @@ $node->safe_psql('postgres', qq{
# Resume the leader process to update the shared parameters during heap scan (i.e.
# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
# before vacuuming indexes due to the injection point.
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
});
-$node->wait_for_event(
- 'autovacuum worker',
- 'autovacuum-leader-before-indexes-processing'
-);
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
# Check whether parallel worker successfully updated all parameters during
# index processing
-$node->wait_for_log(qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
- $log_offset);
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
});
wait_for_autovacuum_complete($node, $av_count);
# Cleanup
-$node->safe_psql('postgres', qq{
+$node->safe_psql(
+ 'postgres', qq{
SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
});
--
2.43.0
[text/x-patch] v29--v30-diff-for-0002.patch (3.0K, 8-v29--v30-diff-for-0002.patch)
From 029354cf40fea428c20de09ccacc6afe503c73f6 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 19 Mar 2026 21:19:35 +0700
Subject: [PATCH] fixes for 0002
---
src/backend/access/common/reloptions.c | 2 +-
src/backend/postmaster/autovacuum.c | 6 +++++-
src/backend/utils/misc/guc_parameters.dat | 2 +-
src/include/utils/rel.h | 7 -------
4 files changed, 7 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 055585c38f3..03e6fae930e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -238,7 +238,7 @@ static relopt_int intRelOpts[] =
{
{
"autovacuum_parallel_workers",
- "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ "Overrides value of the autovacuum_max_parallel_workers parameter for this table, if > -1.",
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ff57d8fca2a..e810e1303db 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2866,7 +2866,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
if (avopts->autovacuum_parallel_workers > 0)
nparallel_workers = avopts->autovacuum_parallel_workers;
else if (avopts->autovacuum_parallel_workers == -1)
- nparallel_workers = 0;
+ {
+ nparallel_workers = autovacuum_max_parallel_workers > 0
+ ? autovacuum_max_parallel_workers
+ : -1; /* disable parallelism if parameter's value is 0 */
+ }
}
tab->at_params.freeze_min_age = freeze_min_age;
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index bc23ddf5201..3d2fd35a004 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -155,7 +155,7 @@
},
{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
- short_desc => 'Maximum number of parallel processes per autovacuuming of one table.',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
variable => 'autovacuum_max_parallel_workers',
boot_val => '2',
min => '0',
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 1981954008e..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -312,14 +312,7 @@ typedef struct AutoVacOpts
{
bool enabled;
- /*
- * Target number of parallel autovacuum workers. 0 by default disables
- * parallel vacuum during autovacuum. -1 means choose the parallel degree
- * based on the number of indexes (the autovacuum_max_parallel_workers
- * parameter will be used as a limit).
- */
int autovacuum_parallel_workers;
-
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-19 23:58 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-19 23:58 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, Mar 19, 2026 at 7:29 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Thu, Mar 19, 2026 at 2:49 AM Masahiko Sawada <[email protected]> wrote:
> >
> > Yes, we already have such a code for PARALLEL option for the VACUUM command.
> >
> > I guess it's better that the autovacuum code also follows this code
> > for consistency.
> >
>
> I agree. You can find it in the v29-0002 patch.
>
> > > I'm afraid that I can't agree with you here. As I wrote above [1], the
> > > parallel a/v feature will be useful when a user has a few huge tables with
> > > a large number of indexes. Only these tables require parallel processing and
> > > the user knows about it.
> >
> > Isn't it a case where users need to increase
> > min_parallel_index_scan_size? Suppose that there are two tables that
> > are big enough and have enough indexes, it's more natural to me to use
> > parallel vacuum for both tables without user manual settings.
> >
>
> Do you mean that the user can increase this parameter so that smaller tables
> are not considered for the parallel a/v? If so, I don't think it will always
> be handy. When I say "smaller tables" I mean that they are small relative to
> super huge tables. But actually these "smaller tables" can be pretty big and
> require a parallel index scan within parallel queries or VACUUM PARALLEL (not
> an autovacuum).
I think that if these "smaller" tables are actually big, they are also
eligible for parallel autovacuum.
> > > If we implement the feature as you suggested, then after setting the
> > > av_max_parallel_workers to N > 0, the user will have to manually disable
> > > processing for all tables except the largest ones. This will need to be done
> > > to ensure that parallel workers are launched specifically to process the
> > > largest tables and are not wasted on processing small ones.
> > >
> > > I.e. I'm proposing a design that will require manual actions to *enable*
> > > parallel a/v for several large tables rather than *disable* it for all
> > > the other tables in the cluster. I'm sure that's what users want.
> > >
> > > Allowing the system to decide which tables to process in parallel is a good
> > > approach from a design perspective. But consider the following example:
> > > Imagine that we have a threshold above which parallel a/v is used.
> > > Several a/v workers encounter tables which exceed this threshold by 1_000 and
> > > each of these workers decides to launch a few parallel workers. Another a/v
> > > worker encounters a table which exceeds this threshold by 1_000_000 and
> > > tries to launch N parallel workers, but runs into a max_parallel_workers
> > > shortage. Thus, processing of this table will take a very long time to
> > > complete due to lack of resources. The only way for users to avoid it is to
> > > disable parallel a/v for all tables that exceed the threshold and are not
> > > of particular interest.
> >
> > I think the same thing happens even with the current design as long as
> > users misconfigure max_parallel_workers, no? Setting
> > autovacuum_max_parallel_workers to >0 would mean that users want to
> > give additional resources for autovacuums in general, I think it makes
> > sense to use parallel vacuum even for tables which exceed the
> > threshold by 1000.
> >
> > Users who want to use parallel autovacuum would have to set
> > max_parallel_workers (and max_worker_processes) high enough so that
> > each autovacuum worker can use parallel workers. If resource
> > contention occurs, it's a sign that the limits are not configured
> > properly.
> >
>
> Yeah, currently a user can misconfigure max_parallel_workers, so (for example)
> multiple VACUUM PARALLEL operations running at the same time will face
> a shortage of parallel workers. But I guess that every system has some sane
> limit for this parameter's value. If we want to ensure that all a/v leaders
> are guaranteed to launch as many parallel workers as required, we might need
> to increase max_parallel_workers too much (and cross the sane limit).
> IMHO it may be unacceptable for many systems in production, because it will
> undermine stability.
I understand the concern: if the max_parallel_workers (and/or
max_worker_processes) values are not high enough for every autovacuum
worker to launch autovacuum_max_parallel_workers workers, an
autovacuum on a very large table might not be able to launch its
full set of workers when some parallel workers are already being
used by others (e.g., another autovacuum on a different,
slightly smaller table). But I'm not sure that the opt-out style
can handle these cases. Even if there are two huge tables and users
set parallel_vacuum_workers on both tables, there is no guarantee that
autovacuums on these tables can use the full set of workers, as long as
the max_parallel_workers value is not high enough.
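
To make the contention scenario concrete, here is a toy model of a
shared worker pool. It is only a sketch: WorkerPool and pool_acquire()
are hypothetical names, not anything in the patch, and 'capacity'
merely stands in for max_parallel_workers.

```c
#include <assert.h>

/* Toy model of a shared worker pool; all names are hypothetical. */
typedef struct WorkerPool
{
	int			capacity;	/* stands in for max_parallel_workers */
	int			in_use;		/* workers currently handed out */
} WorkerPool;

/*
 * Hand out up to 'requested' workers and return how many were granted.
 * A leader that arrives late simply gets whatever is left, possibly zero.
 */
static int
pool_acquire(WorkerPool *pool, int requested)
{
	int			available = pool->capacity - pool->in_use;
	int			granted;

	assert(requested >= 0 && available >= 0);

	granted = (requested < available) ? requested : available;
	pool->in_use += granted;
	return granted;
}
```

With a capacity of 8, three autovacuums that each take 2 workers leave
only 2 for a later request of 4, regardless of whether the tables were
opted in individually or picked up by default.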
>
> > The 0001 patch looks good to me. I've updated the commit message and
> > attached it. I'm going to push the patch, barring any objections.
> >
>
> Great news!
Pushed the 0001 patch.
>
> BTW, do we need to mention that this parameter can be overridden by the
> per-table setting?
IIUC the per-table setting does not actually override the GUC
parameter value; it works as an additional cap. For instance, if
autovacuum_max_parallel_workers is 2 and autovacuum_parallel_workers
is 5, we cap the parallel degree at 2, which is similar to how
other parallel operations behave, such as with the parallel_workers
storage parameter. BTW this is somewhat different from how
other autovacuum-related storage parameters work; for those, the
per-table parameters override the GUC values. I decided to use the
former behavior because autovacuum_max_parallel_workers can then work
as a global switch to disable all parallel autovacuum behavior on the
system.
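
The capping rule above can be sketched as follows. This is an
illustration only, not code from the patch: capped_parallel_degree() is
a hypothetical helper, and it covers just the case where the reloption
is set to a positive value.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patch.
 *
 * reloption_workers: per-table autovacuum_parallel_workers (> 0 here)
 * guc_max_workers:   autovacuum_max_parallel_workers (0 disables all)
 *
 * The GUC acts as an upper bound, so the reloption can only lower the
 * parallel degree, never raise it above the GUC.
 */
static int
capped_parallel_degree(int reloption_workers, int guc_max_workers)
{
	assert(reloption_workers > 0);
	assert(guc_max_workers >= 0);

	return (reloption_workers < guc_max_workers) ? reloption_workers
		: guc_max_workers;
}
```

With the GUC at 2 and the reloption at 5 this yields 2, and with the
GUC at 0 it yields 0 for any reloption value, which is what lets the
GUC act as a global switch.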
> > Part 3 can briefly mention that autovacuum can perform parallel vacuum
> > with parallel workers capped by autovacuum_max_parallel_workers as
> > follow:
> >
> > For tables with the <xref linkend="reloption-autovacuum-parallel-workers"/>
> > storage parameter set, an autovacuum worker can perform index vacuuming and
> > index cleanup with background workers. The number of workers launched by
> > a single autovacuum worker is limited by the
> > <xref linkend="guc-autovacuum-max-parallel-workers"/>.
>
> I suggest adding here also a description of the method for calculating the
> number of parallel workers. If so, I feel that this part of documentation will
> be completely the same as in VACUUM PARALLEL (except a few little details).
> Maybe we can create some dedicated subchapter in the "Routine vacuuming" where
> we describe how the number of parallel workers is decided. Lets call it
> something like "24.1.7 Parallel Vacuuming". Both VACUUM PARALLEL and parallel
> autovacuum can refer to this subchapter. I think it will be much easier to
> maintain. What do you think?
Describing the parallel vacuum in a new chapter in section 24.1 sounds
like a good idea.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-25 07:45 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-25 07:45 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
> > Yeah, currently user can misconfigure max_parallel_workers, so (for example)
> > multiple VACUUM PARALLEL operations running at the same time will face with
> > a shortage of parallel workers. But I guess that every system has some sane
> > limit for this parameter's value. If we want to ensure that all a/v leaders
> > are guaranteed to launch as many parallel workers as required, we might need
> > to increase the max_parallel_workers too much (and cross the sane limit).
> > IMHO it may be unacceptable for many systems in production, because it will
> > undermine the stability.
>
> I understand the concern that if max_parallel_workers (and/or
> max_worker_processes) value are not high enough to ensure each
> autovacuum workers can launch autovacuum_max_parallel_workers, an
> autovacuum on the very large table might not be able to launch the
> full workers in case where some parallel workers are already being
> used by others (e.g., another autovacuum on a different
> slightly-smaller table etc.). But I'm not sure that the opt-out style
> can handle these cases. Even if there are two huge tables and users
> set parallel_vacuum_workers to both tables, there is no guarantee that
> autovacuums on these tables can use the full workers, as long as
> max_parallel_workers value is not enough.
>
I guess you mean the "opt-in" style here?
Sure, even opt-in style doesn't give us an unbreakable guarantee that huge
tables will be processed with the desired number of parallel workers. But IMHO
"opt-in" greatly increases the probability of this. Searching for arguments in
favor of opt-in style, I asked for help from another person who has been
managing the setup of highload systems for decades. He promised to share his
opinion next week.
> >
> > BTW, do we need to mention that this parameter can be overridden by the
> > per-table setting?
>
> IIUC the per-table setting is not actually overwriting the GUC
> parameter value, but it works as an additional cap. For instance, if
> autovacuum_max_parallel_workers is 2 and autovacuum_parallel_workers
> is 5, we cap the parallel degree by 2, which is a similar behavior to
> other parallel operations such as the parallel_workers storage
> parameter. BTW it actually works in a somewhat different way than
> other autovacuum-related storage parameters; the per-table parameters
> overwrite GUC values. I decided to use the former behavior because
> autovacuum_max_parallel_workers can work as a global switch to disable
> all parallel autovacuum behavior on the system.
>
Yep, you are right. I worded that badly. Let me rephrase my question:
Do we need to mention that this parameter can be capped by the per-table
setting?
>
> > > Part 3 can briefly mention that autovacuum can perform parallel vacuum
> > > with parallel workers capped by autovacuum_max_parallel_workers as
> > > follow:
> > >
> > > For tables with the <xref linkend="reloption-autovacuum-parallel-workers"/>
> > > storage parameter set, an autovacuum worker can perform index vacuuming and
> > > index cleanup with background workers. The number of workers launched by
> > > a single autovacuum worker is limited by the
> > > <xref linkend="guc-autovacuum-max-parallel-workers"/>.
> >
> > I suggest adding here also a description of the method for calculating the
> > number of parallel workers. If so, I feel that this part of documentation will
> > be completely the same as in VACUUM PARALLEL (except a few little details).
> > Maybe we can create some dedicated subchapter in the "Routine vacuuming" where
> > we describe how the number of parallel workers is decided. Lets call it
> > something like "24.1.7 Parallel Vacuuming". Both VACUUM PARALLEL and parallel
> > autovacuum can refer to this subchapter. I think it will be much easier to
> > maintain. What do you think?
>
> Describing the parallel vacuum in a new chapter in section 24.1 sounds
> like a good idea.
OK, then I'll do it.
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-25 22:42 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-25 22:42 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Mar 25, 2026 at 12:45 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> > > Yeah, currently user can misconfigure max_parallel_workers, so (for example)
> > > multiple VACUUM PARALLEL operations running at the same time will face with
> > > a shortage of parallel workers. But I guess that every system has some sane
> > > limit for this parameter's value. If we want to ensure that all a/v leaders
> > > are guaranteed to launch as many parallel workers as required, we might need
> > > to increase the max_parallel_workers too much (and cross the sane limit).
> > > IMHO it may be unacceptable for many systems in production, because it will
> > > undermine the stability.
> >
> > I understand the concern that if max_parallel_workers (and/or
> > max_worker_processes) value are not high enough to ensure each
> > autovacuum workers can launch autovacuum_max_parallel_workers, an
> > autovacuum on the very large table might not be able to launch the
> > full workers in case where some parallel workers are already being
> > used by others (e.g., another autovacuum on a different
> > slightly-smaller table etc.). But I'm not sure that the opt-out style
> > can handle these cases. Even if there are two huge tables and users
> > set parallel_vacuum_workers to both tables, there is no guarantee that
> > autovacuums on these tables can use the full workers, as long as
> > max_parallel_workers value is not enough.
> >
>
> I guess you mean the "opt-in" style here?
Oops, yes. I meant the "opt-in" style.
>
> Sure, even opt-in style doesn't give us an unbreakable guarantee that huge
> tables will be processed with the desired number of parallel workers. But IMHO
> "opt-in" greatly increases the probability of this.
> Searching for arguments in
> favor of opt-in style, I asked for help from another person who has been
> managing the setup of highload systems for decades. He promised to share his
> opinion next week.
Given that we have one and a half weeks before the feature freeze, I
think it's better to complete the project first rather than wait for
his/her comments next week. Even if we finish this feature with the
opt-out style, we can hear more opinions on it and change the default
behavior later, as the change would be trivial. What do you think?
I've squashed all patches except for the documentation patch as I
assume you're working on it. The attached fixup patch contains several
changes: switching to the opt-out style, improving comments, fixing
typos, etc.
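
For reference, the opt-out mapping from the reloption to the requested
worker count can be sketched in isolation as below. It mirrors the
table_recheck_autovac() hunk in the attached fixup, but
reloption_to_nworkers() is a hypothetical standalone function, and
reading nworkers = 0 as "let vacuum choose the degree" is an assumption
based on how VACUUM treats that value.

```c
#include <assert.h>

/*
 * Hypothetical standalone version of the reloption handling.
 *
 * Input:  per-table autovacuum_parallel_workers (-1 means "not set").
 * Output: value to store in the vacuum parameters' nworkers field:
 *           -1  parallel vacuum disabled for this table
 *            0  no explicit request (assumed: vacuum picks the degree)
 *           >0  explicit number of workers requested
 */
static int
reloption_to_nworkers(int autovacuum_parallel_workers)
{
	assert(autovacuum_parallel_workers >= -1);

	if (autovacuum_parallel_workers == 0)
		return -1;				/* explicit opt-out */
	if (autovacuum_parallel_workers > 0)
		return autovacuum_parallel_workers;
	return 0;					/* -1: reloption not set */
}
```

Under this scheme a table is skipped for parallel autovacuum only when
its reloption is explicitly set to 0.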
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-patch] v31-0002-fixup-several-changes.patch (13.5K, 2-v31-0002-fixup-several-changes.patch)
From 98e63807d9dbbf2d6153ce4b8139a49f84339a07 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Wed, 25 Mar 2026 14:49:12 -0700
Subject: [PATCH v31 2/2] fixup: several changes.
- use opt-out style.
- adjust default values.
- improve comments.
- fixes typos
etc.
---
src/backend/access/common/reloptions.c | 2 +-
src/backend/access/heap/vacuumlazy.c | 5 +-
src/backend/commands/vacuumparallel.c | 52 +++++++++++++------
src/backend/postmaster/autovacuum.c | 36 +++++++------
src/backend/utils/init/globals.c | 2 +-
src/backend/utils/misc/guc.c | 7 +--
src/backend/utils/misc/guc_parameters.dat | 2 +-
src/backend/utils/misc/postgresql.conf.sample | 2 +-
.../t/001_parallel_autovacuum.pl | 22 ++++----
9 files changed, 82 insertions(+), 48 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ce41b015b32..cee705500f8 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -243,7 +243,7 @@ static relopt_int intRelOpts[] =
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
- 0, -1, 1024
+ -1, -1, 1024
},
{
{
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8c7de657976..9fd4f6febbe 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -864,8 +864,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
dead_items_alloc(vacrel, params.nworkers);
#ifdef USE_INJECTION_POINTS
+
/*
- * Trigger injection point, if parallel autovacuum is about to be started.
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
*/
if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 62b6f50b538..13544de5b93 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -69,10 +69,12 @@ typedef struct PVSharedCostParams
{
/*
* The generation counter is incremented by the leader process each time
- * it updates the shared cost-based vacuum delay parameters. Paralell
+ * it updates the shared cost-based vacuum delay parameters. Parallel
* vacuum workers compares it with their local generation,
* shared_params_generation_local, to detect whether they need to refresh
- * their local parameters.
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
*/
pg_atomic_uint32 generation;
@@ -158,13 +160,13 @@ typedef struct PVShared
/*
* If 'true' then we are running parallel autovacuum. Otherwise, we are
- * running parallel maintenence VACUUM.
+ * running parallel maintenance VACUUM.
*/
bool is_autovacuum;
/*
- * Struct for syncing cost-based vacuum delay parameters between
- * supportive parallel autovacuum workers with leader worker.
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
*/
PVSharedCostParams cost_params;
} PVShared;
@@ -271,7 +273,13 @@ struct ParallelVacuumState
static PVSharedCostParams *pv_shared_cost_params = NULL;
-/* See comments in the PVSharedCostParams for the details */
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
static uint32 shared_params_generation_local = 0;
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
@@ -455,7 +463,7 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
if (shared->is_autovacuum)
{
parallel_vacuum_set_cost_parameters(&shared->cost_params);
- pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
SpinLockInit(&shared->cost_params.mutex);
pv_shared_cost_params = &(shared->cost_params);
@@ -623,7 +631,7 @@ parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
* Updates the cost-based vacuum delay parameters for parallel autovacuum
* workers.
*
- * For non-autovacuum parallel worker this function will have no effect.
+ * For non-autovacuum parallel workers, this function will have no effect.
*/
void
parallel_vacuum_update_shared_delay_params(void)
@@ -632,7 +640,7 @@ parallel_vacuum_update_shared_delay_params(void)
Assert(IsParallelWorker());
- /* Quick return if the wokrer is not running for the autovacuum */
+ /* Quick return if the worker is not running for the autovacuum */
if (pv_shared_cost_params == NULL)
return;
@@ -681,7 +689,7 @@ parallel_vacuum_propagate_shared_delay_params(void)
return;
/*
- * Check if any delay parameters has changed. We can read them without
+ * Check if any delay parameters have changed. We can read them without
* locks as only the leader can modify them.
*/
if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
@@ -905,9 +913,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
#ifdef USE_INJECTION_POINTS
+
/*
- * This injection point is used to wait until parallel autovacuum workers
- * finishes their part of index processing.
+ * Used by tests to pause after workers are launched but before index
+ * vacuuming begins.
*/
if (nworkers > 0)
INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
@@ -1247,15 +1256,26 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
- if (shared->is_autovacuum)
- pv_shared_cost_params = &(shared->cost_params);
-
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 562514e2ece..ce893db1ab5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2797,7 +2797,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int multixact_freeze_table_age;
int log_vacuum_min_duration;
int log_analyze_min_duration;
- int nparallel_workers = -1; /* disabled by default */
/*
* Calculate the vacuum cost parameters and the freeze ages. If there
@@ -2858,19 +2857,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* Decide whether we need to process indexes of table in parallel. */
- if (avopts)
- {
- if (avopts->autovacuum_parallel_workers > 0)
- nparallel_workers = avopts->autovacuum_parallel_workers;
- else if (avopts->autovacuum_parallel_workers == -1)
- {
- nparallel_workers = autovacuum_max_parallel_workers > 0
- ? autovacuum_max_parallel_workers
- : -1; /* disable parallelism if parameter's value is 0 */
- }
- }
-
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2879,7 +2865,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_vacuum_min_duration = log_vacuum_min_duration;
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
- tab->at_params.nworkers = nparallel_workers;
+
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum, if the reloption sets the parallel
+ * degree as zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
/*
* Later, in vacuum_rel(), we check reloptions for any
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 8265a82b639..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,7 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
-int autovacuum_max_parallel_workers = 2;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 45b39b7c47f..1ac8e8fc3be 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3359,9 +3359,10 @@ set_config_with_handle(const char *name, config_handle *handle,
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
* Other changes might need to affect other workers, so forbid them. Note,
- * that parallel autovacuum leader is an exception, because only
- * cost-based delays need to be affected also to parallel autovacuum
- * workers, and we will handle it elsewhere if appropriate.
+ * that parallel autovacuum leader is an exception because only cost-based
+ * delays need to be affected also to parallel autovacuum workers. These
+ * parameters are propagated to its workers during parallel vacuum (see
+ * vacuumparallel.c for details).
*/
if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
action != GUC_ACTION_SAVE &&
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 3d2fd35a004..275198f2023 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -157,7 +157,7 @@
{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
variable => 'autovacuum_max_parallel_workers',
- boot_val => '2',
+ boot_val => '0',
min => '0',
max => 'MAX_PARALLEL_WORKER_LIMIT',
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 11d96f4dd4f..9853df0bdf7 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,7 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
-#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 0364019d5f0..2aca32374a2 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -12,7 +12,7 @@ if ($ENV{enable_injection_points} ne 'yes')
# Before each test we should disable autovacuum for 'test_autovac' table and
# generate some dead tuples in it. Returns the current autovacuum_count of
-# the table tset_autovac.
+# the table test_autovac.
sub prepare_for_next_test
{
my ($node, $test_number) = @_;
@@ -47,16 +47,20 @@ my $psql_out;
my $node = PostgreSQL::Test::Cluster->new('main');
$node->init;
-# Configure postgres, so it can launch parallel autovacuum workers, log all
-# information we are interested in and autovacuum works frequently
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
$node->append_conf(
'postgresql.conf', qq{
- max_worker_processes = 20
- max_parallel_workers = 20
- autovacuum_max_parallel_workers = 4
- log_min_messages = debug2
- autovacuum_naptime = '1s'
- min_parallel_index_scan_size = 0
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
});
$node->start;
--
2.53.0
[text/x-patch] v31-0001-Parallel-autovacuum.patch (31.3K, 3-v31-0001-Parallel-autovacuum.patch)
From 493070daf550b5b7931d21d4b5661ec3b466f51b Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v31 1/2] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 +
src/backend/access/heap/vacuumlazy.c | 9 +
src/backend/commands/vacuum.c | 21 +-
src/backend/commands/vacuumparallel.c | 201 +++++++++++++++++-
src/backend/postmaster/autovacuum.c | 20 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/commands/vacuum.h | 2 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 191 +++++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
20 files changed, 504 insertions(+), 13 deletions(-)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index a6002ae9b07..ce41b015b32 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -236,6 +236,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Overrides value of the autovacuum_max_parallel_workers parameter for this table, if > -1.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ 0, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1969,6 +1978,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f698c2d899b..8c7de657976 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -862,6 +863,14 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+ /*
+ * Trigger injection point, if parallel autovacuum is about to be started.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..62b6f50b538 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -16,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -37,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -51,6 +61,31 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Paralell
+ * vacuum workers compares it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -120,6 +155,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenence VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Struct for syncing cost-based vacuum delay parameters between
+ * supportive parallel autovacuum workers with leader worker.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -222,6 +269,11 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/* See comments in the PVSharedCostParams for the details */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -233,6 +285,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -374,8 +427,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -392,6 +446,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -457,6 +526,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -534,6 +606,103 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For a non-autovacuum parallel worker, this function has no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if the parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameter has changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -555,12 +724,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +773,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -730,6 +904,15 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+ /*
+ * This injection point is used to wait until parallel autovacuum workers
+ * finish their part of index processing.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1070,6 +1253,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1119,6 +1305,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 7ecb069c248..562514e2ece 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1658,7 +1658,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE, or a parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
@@ -2797,6 +2797,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int multixact_freeze_table_age;
int log_vacuum_min_duration;
int log_analyze_min_duration;
+ int nparallel_workers = -1; /* disabled by default */
/*
* Calculate the vacuum cost parameters and the freeze ages. If there
@@ -2856,8 +2857,20 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers > 0)
+ nparallel_workers = avopts->autovacuum_parallel_workers;
+ else if (avopts->autovacuum_parallel_workers == -1)
+ {
+ nparallel_workers = autovacuum_max_parallel_workers > 0
+ ? autovacuum_max_parallel_workers
+ : -1; /* disable parallelism if parameter's value is 0 */
+ }
+ }
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2866,6 +2879,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_vacuum_min_duration = log_vacuum_min_duration;
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ tab->at_params.nworkers = nparallel_workers;
/*
* Later, in vacuum_rel(), we check reloptions for any
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..8265a82b639 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 2;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..45b39b7c47f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only the cost-based
+ * delay parameters need to be propagated to parallel autovacuum workers,
+ * and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0c9854ad8fc..3d2fd35a004 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,14 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e4abe6c0077..11d96f4dd4f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 523d3f39fc5..f6bf072bab5 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f45bca015c..8b42808e70b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -422,6 +422,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..336a212faf4 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..929659956cb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..0364019d5f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,191 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf(
+ 'postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ autovacuum_max_parallel_workers = 4
+ log_min_messages = debug2
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dbbec84b222..e3b1cba5289 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2094,6 +2094,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.53.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-27 03:54 Bharath Rupireddy <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Bharath Rupireddy @ 2026-03-27 03:54 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Wed, Mar 25, 2026 at 3:43 PM Masahiko Sawada <[email protected]> wrote:
>
> Given that we have one and half weeks before the feature freeze, I
> think it's better to complete the project first before waiting for
> his/her comments next week. Even if we finish this feature with the
> opt-out style, we can hear more opinions on it and change the default
> behavior as the change would be trivial. What do you think?
>
> I've squashed all patches except for the documentation patch as I
> assume you're working on it. The attached fixup patch contains several
> changes: using opt-out style, comment improvements, and fixing typos
> etc.
+1 for enabling this feature by default. When enough CPU is available,
vacuuming multiple indexes of a table in parallel in autovacuum
definitely speeds things up. This way we will also get field
experience using this feature.
Thank you for sending the latest patches. I quickly reviewed the v31
patches. Here are some comments.
1/ + {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
I haven't looked at the whole thread, but do we all think we need this
as a relopt? IMHO, we can wait for field experience and introduce this
later. I'm having a hard time finding a use-case where one wants to
disable the indexes at the table level. If there was already an
agreement, I agree to commit to that decision.
2/ + /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
The variable name looks a bit confusing. How about we rely on
AmAutoVacuumWorkerProcess() and avoid the bool in shared memory?
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-28 11:10 Daniil Davydov <[email protected]>
parent: Bharath Rupireddy <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-28 11:10 UTC (permalink / raw)
To: Bharath Rupireddy <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Thu, Mar 26, 2026 at 5:43 AM Masahiko Sawada <[email protected]> wrote:
>
> On Wed, Mar 25, 2026 at 12:45 AM Daniil Davydov <[email protected]> wrote:
> >
> > Searching for arguments in
> > favor of opt-in style, I asked for help from another person who has been
> > managing the setup of highload systems for decades. He promised to share his
> > opinion next week.
>
> Given that we have one and half weeks before the feature freeze, I
> think it's better to complete the project first before waiting for
> his/her comments next week. Even if we finish this feature with the
> opt-out style, we can hear more opinions on it and change the default
> behavior as the change would be trivial. What do you think?
>
Sure, if we can change the default value after the feature freeze, I don't
mind leaving our parameter in opt-out style for now.
> I've squashed all patches except for the documentation patch as I
> assume you're working on it. The attached fixup patch contains several
> changes: using opt-out style, comment improvements, and fixing typos
> etc.
>
Thank you very much for the proposed fixes!
I like the way you have changed the nparallel_workers calculation (autovacuum.c).
Forcing parallel workers to always read the shared cost params the first time
is a good decision. All the comment changes also LGTM.
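The generation-counter scheme discussed here can be sketched outside PostgreSQL roughly as follows. This is only an illustration under stated assumptions: SharedCostParams, WorkerCostState, leader_publish() and worker_refresh() are hypothetical names, C11 atomics stand in for pg_atomic_uint32 and slock_t, and the "initialized" flag is just one possible way to force the first read; it is not taken from the fixup patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's PVSharedCostParams. */
typedef struct SharedCostParams
{
    atomic_uint generation;   /* bumped by the leader after each update */
    atomic_flag lock;         /* toy spinlock protecting the fields below */
    double      cost_delay;
    int         cost_limit;
} SharedCostParams;

/* Hypothetical per-worker copy of the parameters. */
typedef struct WorkerCostState
{
    uint32_t seen_generation; /* last generation this worker consumed */
    bool     initialized;     /* false until the first refresh (assumption) */
    double   cost_delay;
    int      cost_limit;
} WorkerCostState;

static void
spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                     /* busy-wait until the flag is released */
}

static void
spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

void
shared_params_init(SharedCostParams *sp, double delay, int limit)
{
    atomic_init(&sp->generation, 0);
    atomic_flag_clear(&sp->lock);
    sp->cost_delay = delay;
    sp->cost_limit = limit;
}

/* Leader side: store new values under the lock, then bump the generation. */
void
leader_publish(SharedCostParams *sp, double delay, int limit)
{
    spin_lock(&sp->lock);
    sp->cost_delay = delay;
    sp->cost_limit = limit;
    spin_unlock(&sp->lock);
    atomic_fetch_add(&sp->generation, 1);
}

/*
 * Worker side: refresh the local copy on the first call or whenever the
 * generation has moved. Returns true if the local copy was refreshed.
 */
bool
worker_refresh(SharedCostParams *sp, WorkerCostState *ws)
{
    uint32_t gen = atomic_load(&sp->generation);

    if (ws->initialized && gen == ws->seen_generation)
        return false;

    spin_lock(&sp->lock);
    ws->cost_delay = sp->cost_delay;
    ws->cost_limit = sp->cost_limit;
    spin_unlock(&sp->lock);

    ws->seen_generation = gen;
    ws->initialized = true;
    return true;
}
```

If the leader publishes between the worker's generation read and its locked copy, the worker records the older generation and simply refreshes once more on its next call, which is harmless.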
The only place I have changed is reloptions.c:
As you have explained, it is not appropriate to use the "overrides" wording
in the reloption's description, so I decided to restore the old one.
On Fri, Mar 27, 2026 at 10:54 AM Bharath Rupireddy
<[email protected]> wrote:
>
> Hi,
>
> On Wed, Mar 25, 2026 at 3:43 PM Masahiko Sawada <[email protected]> wrote:
> >
> > Given that we have one and half weeks before the feature freeze, I
> > think it's better to complete the project first before waiting for
> > his/her comments next week. Even if we finish this feature with the
> > opt-out style, we can hear more opinions on it and change the default
> > behavior as the change would be trivial. What do you think?
> >
> > I've squashed all patches except for the documentation patch as I
> > assume you're working on it. The attached fixup patch contains several
> > changes: using opt-out style, comment improvements, and fixing typos
> > etc.
>
> +1 for enabling this feature by default. When enough CPU is available,
> vacuuming multiple indexes of a table in parallel in autovacuum
> definitely speeds things up.
Yes, for sure. But I am concerned that enabling parallel a/v for everyone
will cause a shortage of parallel workers while the largest tables are being
processed.
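To put rough numbers on that concern, here is a back-of-the-envelope sketch. It assumes the worst case, where every autovacuum worker is busy with a table that has enough parallel-safe indexes to use its full allowance; the helper names are hypothetical and nothing here is PostgreSQL code.

```c
/*
 * Worst-case number of parallel workers that autovacuum as a whole could
 * request at the same time (hypothetical helper, for illustration only).
 */
int
autovacuum_parallel_demand(int autovacuum_max_workers,
                           int autovacuum_max_parallel_workers)
{
    return autovacuum_max_workers * autovacuum_max_parallel_workers;
}

/*
 * Parallel workers left in the max_parallel_workers pool for everything
 * else (parallel queries, maintenance commands) under that demand.
 */
int
pool_headroom(int max_parallel_workers, int demand)
{
    int left = max_parallel_workers - demand;

    return left > 0 ? left : 0;
}
```

With the values that appear in the patch (autovacuum_max_workers = 3, autovacuum_max_parallel_workers = 2, max_parallel_workers = 8), the worst-case demand is 6, which leaves 2 workers for other consumers of the pool.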
> Thank you for sending the latest patches. I quickly reviewed the v31
> patches. Here are some comments.
>
> 1/ + {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
>
> I haven't looked at the whole thread, but do we all think we need this
> as a relopt? IMHO, we can wait for field experience and introduce this
> later.
I think that we should keep both the reloption and the config parameter.
Getting rid of the reloption would greatly reduce users' ability to
tune this feature. I'm afraid that this may lead to people not using parallel
autovacuum.
> I'm having a hard time finding a use-case where one wants to
> disable the indexes at the table level. If there was already an
> agreement, I agree to commit to that decision.
You can read the discussion from [1] up to the current message in order to
dive into the question.
To make a long story short, I think that the most common use case for this
feature is allowing parallel a/v for 2-3 tables, each of which has ~100
indexes. The rest of the tables do not require parallel processing (at least
it's a much lower priority for them).
At the same time, Masahiko-san thinks that only the system should decide which
tables will be processed in parallel. The system's decision should be based on
the number of indexes and a few other config parameters (e.g.
min_parallel_index_scan_size). Thus, potentially many tables could be
processed in parallel.
(Both opinions are pretty simplified).
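For reference, the way the reloption and the GUC combine can be sketched as two small functions. This mirrors my reading of the table_recheck_autovac() and parallel_vacuum_compute_workers() hunks posted earlier in the thread; the function names are hypothetical, nindexes_parallel is taken as a given input, and the sketch assumes the caller skips parallel vacuum entirely when the request is -1.

```c
/*
 * Resolve the per-table request (hypothetical helper).
 *   has_reloptions  - whether the table has autovacuum reloptions at all
 *   reloption_value - autovacuum_parallel_workers (-1 means "not set")
 *   guc_max         - autovacuum_max_parallel_workers
 * Returns -1 when parallel index processing is disabled for the table.
 */
int
resolve_requested_workers(int has_reloptions, int reloption_value, int guc_max)
{
    if (!has_reloptions)
        return -1;
    if (reloption_value > 0)
        return reloption_value;
    if (reloption_value == -1)
        return guc_max > 0 ? guc_max : -1;
    return -1;                  /* reloption explicitly set to 0 */
}

/*
 * Cap the request by the number of indexes that can be processed in
 * parallel and by the GUC (hypothetical helper; expects nrequested >= 0,
 * where 0 means "no explicit request").
 */
int
cap_workers(int nrequested, int nindexes_parallel, int guc_max)
{
    int n;

    if (guc_max == 0)
        return 0;

    if (nrequested > 0)
        n = nrequested < nindexes_parallel ? nrequested : nindexes_parallel;
    else
        n = nindexes_parallel;

    return n < guc_max ? n : guc_max;
}
```

Under these assumptions a table with autovacuum_parallel_workers = 2, three parallel-capable indexes and a GUC value of 4 ends up with 2 planned workers, which is consistent with the "2 planned, 2 launched" message the TAP test looks for.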
>
> 2/ + /*
> + * If 'true' then we are running parallel autovacuum. Otherwise, we are
> + * running parallel maintenance VACUUM.
> + */
> + bool is_autovacuum;
> +
>
> The variable name looks a bit confusing. How about we rely on
> AmAutoVacuumWorkerProcess() and avoid the bool in shared memory?
This variable is needed for parallel workers, which are taken from the
bgworkers pool, i.e. AmAutoVacuumWorkerProcess() will return 'false' for them.
We need the "is_autovacuum" variable in order to understand exactly what this
process was started for (VACUUM PARALLEL or parallel autovacuum).
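A toy model of the argument, with no PostgreSQL headers involved: ProcKind, SharedState and the functions below are hypothetical, and am_autovacuum_worker() is only analogous in spirit to AmAutoVacuumWorkerProcess().

```c
#include <stdbool.h>

/* Hypothetical process kinds; real PostgreSQL tracks this differently. */
typedef enum ProcKind
{
    PROC_AUTOVACUUM_WORKER,   /* leader of a parallel autovacuum */
    PROC_BACKEND,             /* leader of VACUUM (PARALLEL) */
    PROC_BGWORKER             /* parallel vacuum worker from the bgworker pool */
} ProcKind;

/* Hypothetical stand-in for the relevant part of PVShared. */
typedef struct SharedState
{
    bool is_autovacuum;       /* set once by the leader */
} SharedState;

/* Process-local test: only true in the autovacuum worker itself. */
bool
am_autovacuum_worker(ProcKind self)
{
    return self == PROC_AUTOVACUUM_WORKER;
}

/* Leader side: record in shared memory what kind of vacuum this is. */
void
leader_init_shared(SharedState *shared, ProcKind leader)
{
    shared->is_autovacuum = am_autovacuum_worker(leader);
}

/*
 * Worker side: a bgworker cannot learn the leader's kind from a
 * process-local test, so it consults the shared flag instead.
 */
bool
worker_should_share_cost_params(const SharedState *shared)
{
    return shared->is_autovacuum;
}
```

In this model, am_autovacuum_worker(PROC_BGWORKER) is always false, so without the shared flag a parallel worker would never attach to the shared cost parameters.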
Thanks everyone for the review!
Please see the updated set of patches:
As I promised, I created a dedicated chapter describing Parallel Vacuum.
Both maintenance VACUUM and autovacuum now refer to this chapter.
I am pretty inexperienced in writing documentation, so forgive me if
something does not follow the documentation style.
[1] https://www.postgresql.org/message-id/CAJDiXggH1bW%3D4n%2B55CGLvs_sRU4SYNXwYLZ37wvJ5H_3yURSPw%40mail...
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v32-0002-Documantation-for-parallel-autovacuum.patch (8.6K, 2-v32-0002-Documantation-for-parallel-autovacuum.patch)
From d13ccaa55862efaad2abda73cf6870dde8ca24d1 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 28 Mar 2026 17:37:07 +0700
Subject: [PATCH v32 2/2] Documantation for parallel autovacuum
---
doc/src/sgml/config.sgml | 19 +++++++++++++++
doc/src/sgml/maintenance.sgml | 38 ++++++++++++++++++++++++++++++
doc/src/sgml/ref/create_table.sgml | 18 ++++++++++++++
doc/src/sgml/ref/vacuum.sgml | 24 ++++---------------
4 files changed, 79 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 229f41353eb..cbe73172b2e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9485,6 +9486,24 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for <xref linkend="parallel-vacuum"/>
+ by a single autovacuum worker. This value is capped by <xref linkend="guc-max-parallel-workers"/>.
+ The default is 0, which disables parallel index vacuuming by
+ autovacuum.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0d2a28207ed..a1b851ea58f 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -927,6 +927,13 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers in order to perform
+ <xref linkend="parallel-vacuum"/>.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
@@ -1166,6 +1173,37 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</sect3>
</sect2>
+
+ <sect2 id="parallel-vacuum" xreflabel="Parallel Vacuum">
+ <title>Parallel Vacuum</title>
+
+ <para>
+ Both the <command>VACUUM</command> command and the autovacuum daemon can
+ perform the index vacuum and index cleanup phases in parallel using
+ <replaceable class="parameter">integer</replaceable> background workers
+ (for the details of each vacuum phase, please refer to
+ <xref linkend="vacuum-phases"/>). The number of workers used to perform
+ the operation is equal to the number of indexes on the relation that
+ support parallel vacuum, which may be further limited by the value specific
+ to <command>VACUUM</command> or autovacuum (see the <literal>PARALLEL</literal>
+ option for <xref linkend="sql-vacuum"/> or
+ <xref linkend="reloption-autovacuum-parallel-workers"/>, respectively).
+ </para>
+
+ <para>
+ An index can participate in parallel vacuum if and only if the size of the
+ index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+ Please note that it is not guaranteed that the number of parallel workers
+ specified in <replaceable class="parameter">integer</replaceable> will be
+ used during execution. It is possible for a vacuum to run with fewer
+ workers than specified, or even with no workers at all. Only one worker
+ can be used per index. So parallel workers are launched only when there
+ are at least <literal>2</literal> indexes in the table. Workers for
+ vacuum are launched before the start of each phase and exit at the end of
+ the phase. These behaviors might change in a future release.
+ </para>
+
+ </sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 80829b23945..8f5b25411cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1738,6 +1738,24 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can be used
+ for <xref linkend="parallel-vacuum"/>.
+ The default value is 0, which means no parallel index vacuuming for
+ this table. If the value is -1, the parallel degree is computed based on
+ the number of indexes and limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index ac5d083d468..d3f32851386 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -81,7 +81,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
is not obtained. However, extra space is not returned to the operating
system (in most cases); it's just kept available for re-use within the
same table. It also allows us to leverage multiple CPUs in order to process
- indexes. This feature is known as <firstterm>parallel vacuum</firstterm>.
+ indexes. This feature is known as <xref linkend="parallel-vacuum"/>.
To disable this feature, one can use <literal>PARALLEL</literal> option and
specify parallel workers as zero. <command>VACUUM FULL</command> rewrites
the entire contents of the table into a new disk file with no extra space,
@@ -266,25 +266,9 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<term><literal>PARALLEL</literal></term>
<listitem>
<para>
- Perform index vacuum and index cleanup phases of <command>VACUUM</command>
- in parallel using <replaceable class="parameter">integer</replaceable>
- background workers (for the details of each vacuum phase, please
- refer to <xref linkend="vacuum-phases"/>). The number of workers used
- to perform the operation is equal to the number of indexes on the
- relation that support parallel vacuum which is limited by the number of
- workers specified with <literal>PARALLEL</literal> option if any which is
- further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
- An index can participate in parallel vacuum if and only if the size of the
- index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
- Please note that it is not guaranteed that the number of parallel workers
- specified in <replaceable class="parameter">integer</replaceable> will be
- used during execution. It is possible for a vacuum to run with fewer
- workers than specified, or even with no workers at all. Only one worker
- can be used per index. So parallel workers are launched only when there
- are at least <literal>2</literal> indexes in the table. Workers for
- vacuum are launched before the start of each phase and exit at the end of
- the phase. These behaviors might change in a future release. This
- option can't be used with the <literal>FULL</literal> option.
+ Limits the number of parallel workers used to perform the
+ <xref linkend="parallel-vacuum"/>, which is further limited by
+ <xref linkend="guc-max-parallel-maintenance-workers"/>.
</para>
</listitem>
</varlistentry>
--
2.43.0
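The worker-count rule described in the new Parallel Vacuum section above can be sketched as a small standalone function. This is a simplified illustration, not the body of parallel_vacuum_compute_workers(): compute_parallel_workers and min_int are hypothetical names, the count of eligible indexes is passed in directly, and the step where the leader keeps one index for itself is an assumption about existing vacuumparallel.c behaviour that this patch does not show.

```c
#include <assert.h>

static int
min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * nindexes_parallel: number of indexes eligible for parallel vacuum
 * nrequested:        requested degree, 0 means "choose automatically"
 * max_workers:       cap from the relevant GUC (max_parallel_maintenance_workers
 *                    for VACUUM, autovacuum_max_parallel_workers for autovacuum)
 */
static int
compute_parallel_workers(int nindexes_parallel, int nrequested, int max_workers)
{
	int			workers;

	assert(nindexes_parallel >= 0 && nrequested >= 0 && max_workers >= 0);

	/* Parallelism is disabled by the GUC. */
	if (max_workers == 0)
		return 0;

	/* Assumption: the leader processes one index itself. */
	nindexes_parallel--;
	if (nindexes_parallel <= 0)
		return 0;

	/* Honor an explicit request, but never exceed the eligible indexes. */
	workers = (nrequested > 0) ?
		min_int(nrequested, nindexes_parallel) : nindexes_parallel;

	/* Cap by the GUC. */
	return min_int(workers, max_workers);
}
```

Under these assumptions a table with a single eligible index never gets a worker, which matches the "at least 2 indexes" statement in the documentation, and with autovacuum_max_parallel_workers left at 0 autovacuum never launches parallel workers.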
[text/x-patch] v32-0001-Parallel-autovacuum.patch (32.0K, 3-v32-0001-Parallel-autovacuum.patch)
download | inline diff:
From 5dcd6b42a0d1aae8f3f772b57f5568545e386ad6 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v32 1/2] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 +
src/backend/access/heap/vacuumlazy.c | 12 +
src/backend/commands/vacuum.c | 21 +-
src/backend/commands/vacuumparallel.c | 223 +++++++++++++++++-
src/backend/postmaster/autovacuum.c | 26 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 9 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/commands/vacuum.h | 2 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 195 +++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
20 files changed, 539 insertions(+), 14 deletions(-)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
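For readers skimming the diff below, the generation-counter scheme used to pass cost-based delay parameters from the leader to its workers can be sketched in portable C11. This is a simplified illustration under stated assumptions, not the patch's code: CostParams, SharedCostParams, WorkerLocal and all function names are hypothetical, only two parameters are carried, and an atomic_bool spin loop stands in for slock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct CostParams
{
	double		cost_delay;
	int			cost_limit;
} CostParams;

typedef struct SharedCostParams
{
	atomic_uint generation;		/* bumped by the leader on every change */
	atomic_bool locked;			/* stand-in for a spinlock */
	CostParams	params;			/* protected by "locked" */
} SharedCostParams;

typedef struct WorkerLocal
{
	uint32_t	generation_seen;	/* 0 until the first refresh */
	CostParams	params;
} WorkerLocal;

static void
lock_acquire(SharedCostParams *s)
{
	while (atomic_exchange(&s->locked, true))
		;						/* spin */
}

static void
lock_release(SharedCostParams *s)
{
	atomic_store(&s->locked, false);
}

/* Leader: generation starts at 1 so a fresh worker (0) always refreshes. */
static void
shared_init(SharedCostParams *s, CostParams initial)
{
	s->params = initial;
	atomic_init(&s->generation, 1);
	atomic_init(&s->locked, false);
}

/* Leader: publish new values only if something changed. */
static bool
leader_propagate(SharedCostParams *s, CostParams current)
{
	/* Unlocked read is fine here: only the leader ever writes. */
	if (current.cost_delay == s->params.cost_delay &&
		current.cost_limit == s->params.cost_limit)
		return false;

	lock_acquire(s);
	s->params = current;
	lock_release(s);

	atomic_fetch_add(&s->generation, 1);
	return true;
}

/* Worker: cheap generation check first, take the lock only on a mismatch. */
static bool
worker_refresh(SharedCostParams *s, WorkerLocal *local)
{
	uint32_t	generation = atomic_load(&s->generation);

	assert(local->generation_seen <= generation);
	if (generation == local->generation_seen)
		return false;

	lock_acquire(s);
	local->params = s->params;
	lock_release(s);

	local->generation_seen = generation;
	return true;
}
```

The point of the design is that the common case in the worker, where nothing has changed, costs one atomic read and no lock acquisition.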
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index a6002ae9b07..56dc99bdbe1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -236,6 +236,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1969,6 +1978,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f698c2d899b..9fd4f6febbe 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -862,6 +863,17 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..13544de5b93 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -16,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -37,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -51,6 +61,33 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -120,6 +157,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -222,6 +271,17 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -233,6 +293,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -374,8 +435,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -392,6 +454,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -534,6 +614,103 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For non-autovacuum parallel workers, this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if the parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -555,12 +732,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +781,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -730,6 +912,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause after workers are launched but before index
+ * vacuuming begins.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1064,7 +1256,21 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
@@ -1119,6 +1325,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d695f1de4bd..ced4970a3a8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1688,7 +1688,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE, or a parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
@@ -2928,8 +2928,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2939,6 +2938,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum if the reloption sets the parallel
+ * degree to zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
+
/*
* Later, in vacuum_rel(), we check reloptions for any
* vacuum_max_eager_freeze_failure_rate override.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..1ac8e8fc3be 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,14 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to reach its parallel workers. These
+ * parameters are propagated to the workers during parallel vacuum (see
+ * vacuumparallel.c for details).
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0a862693fcd..6ef46d88155 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -170,6 +170,14 @@
max => '10.0',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cf15597385b..73c49f09aef 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 523d3f39fc5..f6bf072bab5 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f45bca015c..8b42808e70b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -422,6 +422,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..336a212faf4 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..929659956cb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..2aca32374a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,195 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
+$node->append_conf(
+ 'postgresql.conf', qq{
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is inited
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e3c1007abdf..aeaf9307558 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2097,6 +2097,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.43.0
[text/x-patch] v31--v32-dif-for-0001.patch (13.7K, 4-v31--v32-dif-for-0001.patch)
From 3954e445a3e794ef19dbccb699a0e61d36a33733 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 28 Mar 2026 15:32:01 +0700
Subject: [PATCH 2/2] fixes for 0001 patch
---
src/backend/access/common/reloptions.c | 4 +-
src/backend/access/heap/vacuumlazy.c | 5 +-
src/backend/commands/vacuumparallel.c | 52 +++++++++++++------
src/backend/postmaster/autovacuum.c | 36 +++++++------
src/backend/utils/init/globals.c | 2 +-
src/backend/utils/misc/guc.c | 7 +--
src/backend/utils/misc/guc_parameters.dat | 2 +-
src/backend/utils/misc/postgresql.conf.sample | 2 +-
.../t/001_parallel_autovacuum.pl | 22 ++++----
9 files changed, 83 insertions(+), 49 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ce41b015b32..56dc99bdbe1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -239,11 +239,11 @@ static relopt_int intRelOpts[] =
{
{
"autovacuum_parallel_workers",
- "Overrides value of the autovacuum_max_parallel_workers parameter for this table, if > -1.",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
- 0, -1, 1024
+ -1, -1, 1024
},
{
{
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8c7de657976..9fd4f6febbe 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -864,8 +864,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
dead_items_alloc(vacrel, params.nworkers);
#ifdef USE_INJECTION_POINTS
+
/*
- * Trigger injection point, if parallel autovacuum is about to be started.
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
*/
if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 62b6f50b538..13544de5b93 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -69,10 +69,12 @@ typedef struct PVSharedCostParams
{
/*
* The generation counter is incremented by the leader process each time
- * it updates the shared cost-based vacuum delay parameters. Paralell
+ * it updates the shared cost-based vacuum delay parameters. Parallel
* vacuum workers compares it with their local generation,
* shared_params_generation_local, to detect whether they need to refresh
- * their local parameters.
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
*/
pg_atomic_uint32 generation;
@@ -158,13 +160,13 @@ typedef struct PVShared
/*
* If 'true' then we are running parallel autovacuum. Otherwise, we are
- * running parallel maintenence VACUUM.
+ * running parallel maintenance VACUUM.
*/
bool is_autovacuum;
/*
- * Struct for syncing cost-based vacuum delay parameters between
- * supportive parallel autovacuum workers with leader worker.
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
*/
PVSharedCostParams cost_params;
} PVShared;
@@ -271,7 +273,13 @@ struct ParallelVacuumState
static PVSharedCostParams *pv_shared_cost_params = NULL;
-/* See comments in the PVSharedCostParams for the details */
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
static uint32 shared_params_generation_local = 0;
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
@@ -455,7 +463,7 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
if (shared->is_autovacuum)
{
parallel_vacuum_set_cost_parameters(&shared->cost_params);
- pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
SpinLockInit(&shared->cost_params.mutex);
pv_shared_cost_params = &(shared->cost_params);
@@ -623,7 +631,7 @@ parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
* Updates the cost-based vacuum delay parameters for parallel autovacuum
* workers.
*
- * For non-autovacuum parallel worker this function will have no effect.
+ * For non-autovacuum parallel workers, this function will have no effect.
*/
void
parallel_vacuum_update_shared_delay_params(void)
@@ -632,7 +640,7 @@ parallel_vacuum_update_shared_delay_params(void)
Assert(IsParallelWorker());
- /* Quick return if the wokrer is not running for the autovacuum */
+ /* Quick return if the worker is not running for the autovacuum */
if (pv_shared_cost_params == NULL)
return;
@@ -681,7 +689,7 @@ parallel_vacuum_propagate_shared_delay_params(void)
return;
/*
- * Check if any delay parameters has changed. We can read them without
+ * Check if any delay parameters have changed. We can read them without
* locks as only the leader can modify them.
*/
if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
@@ -905,9 +913,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
#ifdef USE_INJECTION_POINTS
+
/*
- * This injection point is used to wait until parallel autovacuum workers
- * finishes their part of index processing.
+ * Used by tests to pause after workers are launched but before index
+ * vacuuming begins.
*/
if (nworkers > 0)
INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
@@ -1247,15 +1256,26 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
- if (shared->is_autovacuum)
- pv_shared_cost_params = &(shared->cost_params);
-
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3124341721c..ced4970a3a8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2869,7 +2869,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int multixact_freeze_table_age;
int log_vacuum_min_duration;
int log_analyze_min_duration;
- int nparallel_workers = -1; /* disabled by default */
/*
* Calculate the vacuum cost parameters and the freeze ages. If there
@@ -2930,19 +2929,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* Decide whether we need to process indexes of table in parallel. */
- if (avopts)
- {
- if (avopts->autovacuum_parallel_workers > 0)
- nparallel_workers = avopts->autovacuum_parallel_workers;
- else if (avopts->autovacuum_parallel_workers == -1)
- {
- nparallel_workers = autovacuum_max_parallel_workers > 0
- ? autovacuum_max_parallel_workers
- : -1; /* disable parallelism if parameter's value is 0 */
- }
- }
-
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2951,7 +2937,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_vacuum_min_duration = log_vacuum_min_duration;
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
- tab->at_params.nworkers = nparallel_workers;
+
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum, if the reloption sets the parallel
+ * degree as zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
/*
* Later, in vacuum_rel(), we check reloptions for any
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 8265a82b639..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,7 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
-int autovacuum_max_parallel_workers = 2;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 45b39b7c47f..1ac8e8fc3be 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3359,9 +3359,10 @@ set_config_with_handle(const char *name, config_handle *handle,
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
* Other changes might need to affect other workers, so forbid them. Note,
- * that parallel autovacuum leader is an exception, because only
- * cost-based delays need to be affected also to parallel autovacuum
- * workers, and we will handle it elsewhere if appropriate.
+ * that parallel autovacuum leader is an exception because only cost-based
+ * delays need to be affected also to parallel autovacuum workers. These
+ * parameters are propagated to its workers during parallel vacuum (see
+ * vacuumparallel.c for details).
*/
if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
action != GUC_ACTION_SAVE &&
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index e16c756e2cc..6ef46d88155 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -173,7 +173,7 @@
{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
variable => 'autovacuum_max_parallel_workers',
- boot_val => '2',
+ boot_val => '0',
min => '0',
max => 'MAX_PARALLEL_WORKER_LIMIT',
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 6097d967ec1..73c49f09aef 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,7 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
-#autovacuum_max_parallel_workers = 2 # limited by max_parallel_workers
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 0364019d5f0..2aca32374a2 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -12,7 +12,7 @@ if ($ENV{enable_injection_points} ne 'yes')
# Before each test we should disable autovacuum for 'test_autovac' table and
# generate some dead tuples in it. Returns the current autovacuum_count of
-# the table tset_autovac.
+# the table test_autovac.
sub prepare_for_next_test
{
my ($node, $test_number) = @_;
@@ -47,16 +47,20 @@ my $psql_out;
my $node = PostgreSQL::Test::Cluster->new('main');
$node->init;
-# Configure postgres, so it can launch parallel autovacuum workers, log all
-# information we are interested in and autovacuum works frequently
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
$node->append_conf(
'postgresql.conf', qq{
- max_worker_processes = 20
- max_parallel_workers = 20
- autovacuum_max_parallel_workers = 4
- log_min_messages = debug2
- autovacuum_naptime = '1s'
- min_parallel_index_scan_size = 0
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
});
$node->start;
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-30 00:17 SATYANARAYANA NARLAPURAM <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-03-30 00:17 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi
On Sat, Mar 28, 2026 at 4:11 AM Daniil Davydov <[email protected]> wrote:
> Hi,
>
> On Thu, Mar 26, 2026 at 5:43 AM Masahiko Sawada <[email protected]> wrote:
> >
> > On Wed, Mar 25, 2026 at 12:45 AM Daniil Davydov <[email protected]> wrote:
> > >
> > > Searching for arguments in
> > > favor of opt-in style, I asked for help from another person who has been
> > > managing the setup of highload systems for decades. He promised to share his
> > > opinion next week.
> >
> > Given that we have one and a half weeks before the feature freeze, I
> > think it's better to complete the project first before waiting for
> > his/her comments next week. Even if we finish this feature with the
> > opt-out style, we can hear more opinions on it and change the default
> > behavior as the change would be trivial. What do you think?
> >
>
> Sure, if we can change the default value after the feature freeze, I don't
> mind leaving our parameter in opt-out style for now.
>
> > I've squashed all patches except for the documentation patch as I
> > assume you're working on it. The attached fixup patch contains several
> > changes: using opt-out style, comment improvements, and fixing typos
> > etc.
> >
>
> Thank you very much for the proposed fixes!
> I like the way you have changed the nparallel_workers calculation
> (autovacuum.c).
> Forcing parallel workers to always read the shared cost params on their
> first check is a good decision. All comment changes also LGTM.
>
> The only place that I have changed is reloptions.c:
> As you have explained, it is not appropriate to use the "overrides" wording
> in the reloption's description, so I decided to restore the old one.
>
> On Fri, Mar 27, 2026 at 10:54 AM Bharath Rupireddy
> <[email protected]> wrote:
> >
> > Hi,
> >
> > On Wed, Mar 25, 2026 at 3:43 PM Masahiko Sawada <[email protected]> wrote:
> > >
> > > Given that we have one and a half weeks before the feature freeze, I
> > > think it's better to complete the project first before waiting for
> > > his/her comments next week. Even if we finish this feature with the
> > > opt-out style, we can hear more opinions on it and change the default
> > > behavior as the change would be trivial. What do you think?
> > >
> > > I've squashed all patches except for the documentation patch as I
> > > assume you're working on it. The attached fixup patch contains several
> > > changes: using opt-out style, comment improvements, and fixing typos
> > > etc.
> >
> > +1 for enabling this feature by default. When enough CPU is available,
> > vacuuming multiple indexes of a table in parallel in autovacuum
> > definitely speeds things up.
>
> Yes, for sure. But I have concerns that enabling parallel a/v for everyone
> will cause a shortage of parallel workers while processing the largest
> tables.
>
> > Thank you for sending the latest patches. I quickly reviewed the v31
> > patches. Here are some comments.
> >
> > 1/ + {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
> >
> > I haven't looked at the whole thread, but do we all think we need this
> > as a relopt? IMHO, we can wait for field experience and introduce this
> > later.
>
> I think that we should keep both the reloption and the config parameter.
> Getting rid of the reloption will greatly reduce the ability of users to
> tune this feature. I'm afraid that this may lead to people not using parallel
> autovacuum.
>
> > I'm having a hard time finding a use-case where one wants to
> > disable the indexes at the table level. If there was already an
> > agreement, I agree to commit to that decision.
>
> You can read the discussion from [1] to the current message in order to dive
> into the question.
>
> To make a long story short, I think that the most common use case for this
> feature is allowing parallel a/v for 2-3 tables, each of which has ~100
> indexes. The rest of the tables do not require parallel processing (at least
> it's a much lower priority for them).
>
> At the same time, Masahiko-san thinks that only the system should decide which
> tables will be processed in parallel. The system's decision should be based on
> the number of indexes and a few other config parameters (e.g.
> min_parallel_index_scan_size). Thus, possibly many tables will be able to be
> processed in parallel.
>
> (Both opinions are pretty simplified).
>
> >
> > 2/ + /*
> > + * If 'true' then we are running parallel autovacuum. Otherwise, we are
> > + * running parallel maintenence VACUUM.
> > + */
> > + bool is_autovacuum;
> > +
> >
> > The variable name looks a bit confusing. How about we rely on
> > AmAutoVacuumWorkerProcess() and avoid the bool in shared memory?
>
> This variable is needed for parallel workers, which are taken from the
> bgworkers pool, i.e. AmAutoVacuumWorkerProcess() will return 'false' for them.
> We need the "is_autovacuum" variable in order to understand exactly what this
> process was started for (VACUUM PARALLEL or parallel autovacuum).
>
>
> Thanks everyone for the review!
> Please see the updated set of patches:
> As I promised, I created a dedicated chapter describing Parallel Vacuum.
> Both maintenance VACUUM and autovacuum now refer to this chapter.
>
> I am pretty inexperienced in documentation writing, so forgive me if
> something does not follow the documentation style.
>
> [1]
> https://www.postgresql.org/message-id/CAJDiXggH1bW%3D4n%2B55CGLvs_sRU4SYNXwYLZ37wvJ5H_3yURSPw%40mail...
Thank you for working on this, very useful feature. Sharing a few thoughts:
1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM
resources in parallel_vacuum_compute_workers?
2. Is it intentional that other autovacuum workers do not yield cost limits to
the parallel autovacuum workers? Cost limits are first distributed equally
among the autovacuum workers, and the parallel workers then share their
leader's portion. Therefore, parallel workers will be heavily
throttled. IIUC, this problem doesn't exist with manual vacuum.
If we don't fix this, we should at least document it.
3. Additionally, is there a point where, based on the cost limits,
launching additional workers becomes counterproductive compared to running
fewer workers, and should we prevent that?
4. Would it make sense to add a table-level override to disable parallelism
or set the parallel worker count?
I ran some perf tests to show the improvements with parallel vacuum and
share the results below.
System Configuration
--------------------
Hardware:
CPU: 16 cores
RAM: 128 GB
Storage: NVMe SSDs
OS: Ubuntu Linux
Workload Description
--------------------
Table: avtest
- 5,000,000 rows
- 9 columns: id (bigint PK), col1-col5 (int), col6 (text), col7 (timestamp),
padding (text, 50 bytes)
- 8 indexes:
avtest_pkey (col: id) 107 MB
idx_av_col7 (col: col7) 107 MB
idx_av_col2 (col: col2) 56 MB
idx_av_col4 (col: col4) 56 MB
idx_av_col5 (col: col5) 56 MB
idx_av_col1 (col: col1) 56 MB
idx_av_col3 (col: col3) 56 MB
idx_av_col6 (col: col6) 35 MB
- Total size: 1171 MB
Each test iteration:
1. Delete 2,000,000 rows (40%) using: DELETE WHERE id % 5 IN (1, 2)
2. CHECKPOINT to flush dirty pages
3. Trigger autovacuum by setting autovacuum_vacuum_threshold = 100 and
autovacuum_vacuum_scale_factor = 0 on the table
4. Wait for autovacuum to complete (detected via server log)
5. Re-insert the deleted rows and VACUUM to restore the table for the
next run
Test Methodology
----------------
Worker configurations tested: 0, 2, 4, 7 parallel workers
(7 is the maximum: nindexes - 1, since the leader always handles one
index)
Two experiments were run with different cost-based vacuum delay settings:
Experiment A: cost_limit=200, cost_delay=2ms
Experiment B: cost_limit=60, cost_delay=2ms
Common server settings for both experiments:
shared_buffers = 120 GB (entire dataset fits in shared buffers)
maintenance_work_mem = 1 GB
max_wal_size = 100 GB (prevents checkpoints during vacuum)
min_wal_size = 10 GB
checkpoint_timeout = 1 hour (prevents time-based checkpoints)
wal_buffers = 128 MB
max_parallel_workers = 16
max_worker_processes = 24
autovacuum_naptime = 1s
Between every single run:
1. PostgreSQL server is fully stopped (pg_ctl stop -m fast)
2. OS page cache is dropped (echo 3 > /proc/sys/vm/drop_caches)
3. Server is restarted with a clean log file
4. After DELETE and CHECKPOINT, the server is stopped again, OS caches
dropped again, and the server restarted -- so vacuum starts fully cold
5. The autovacuum_max_parallel_workers GUC is reloaded via pg_ctl reload
Each configuration was tested for 5 iterations.
Timing is extracted from the PostgreSQL server log "system usage" line that
autovacuum emits at completion. This reports elapsed wall-clock time and CPU
time for the autovacuum worker leader process.
Results: Experiment A (cost_limit=200, cost_delay=2ms)
------------------------------------------------------
Workers Iter1 Iter2 Iter3 Iter4 Iter5 Avg(s) Speedup
------- ------ ------ ------ ------ ------ ------ -------
0 66.21 79.11 66.27 77.11 66.30 71.00 1.00x
2 66.55 53.27 52.66 55.74 55.71 56.78 1.25x
4 51.50 51.74 65.07 52.06 70.25 58.12 1.22x
7 50.05 50.35 50.04 50.12 50.07 50.12 1.41x
CPU usage (leader process only):
Workers Avg CPU user Avg CPU sys
------- ----------- ----------
0 3.04s 1.70s
2 1.24s 1.50s
4 0.78s 1.49s
7 0.79s 1.48s
Results: Experiment B (cost_limit=60, cost_delay=2ms)
-----------------------------------------------------
Workers Iter1 Iter2 Iter3 Iter4 Iter5 Avg(s) Speedup
------- ------ ------ ------ ------ ------ ------ -------
0 199.00 195.26 191.44 191.90 191.67 193.85 1.00x
2 160.68 181.33 176.85 167.84 159.47 169.23 1.14x
4 154.02 165.02 174.33 164.16 156.53 162.81 1.19x
7 148.49 158.68 160.66 154.37 149.20 154.28 1.25x
CPU usage (leader process only):
Workers Avg CPU user Avg CPU sys
------- ----------- ----------
0 3.06s 1.90s
2 1.28s 1.72s
4 0.80s 1.69s
7 0.82s 1.68s
*Observations:*
1. Parallel autovacuum provides a consistent speedup. With cost_limit=200 and
7 workers, vacuum completes 1.41x faster (71s -> 50s). With cost_limit=60,
the speedup is 1.25x (194s -> 154s).
2. The benefit comes from parallelizing index vacuum. With 8 indexes totaling
~530 MB, parallel workers scan indexes concurrently instead of the leader
scanning them one by one. The leader's CPU user time drops from ~3s to
~0.8s as index work is offloaded.
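As a worked check of the arithmetic behind the tables above, here is a minimal Python sketch that recomputes the averages and speedups for Experiment A from the per-iteration timings. The names `experiment_a`, `mean` and `speedups` are illustrative only. Recomputing from the two-decimal values shown can differ from the reported figures in the last digit, presumably because the reported values were derived from unrounded timings or truncated rather than rounded.

```python
# Per-iteration elapsed times (seconds), copied from the Experiment A table.
experiment_a = {
    0: [66.21, 79.11, 66.27, 77.11, 66.30],
    2: [66.55, 53.27, 52.66, 55.74, 55.71],
    4: [51.50, 51.74, 65.07, 52.06, 70.25],
    7: [50.05, 50.35, 50.04, 50.12, 50.07],
}


def mean(values):
    """Arithmetic mean of a non-empty list of timings."""
    return sum(values) / len(values)


def speedups(results, baseline_workers=0):
    """Return {workers: (avg_seconds, speedup_vs_baseline)}."""
    baseline = mean(results[baseline_workers])
    return {w: (mean(t), baseline / mean(t)) for w, t in results.items()}


if __name__ == "__main__":
    for workers, (avg, ratio) in sorted(speedups(experiment_a).items()):
        print(f"{workers} workers: avg {avg:.2f}s, speedup {ratio:.2f}x")
```

The same function can be applied to the Experiment B timings by building a second dictionary in the same shape.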
Thanks,
Satya
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-30 08:44 Daniil Davydov <[email protected]>
parent: SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 3 replies; 112+ messages in thread
From: Daniil Davydov @ 2026-03-30 08:44 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> Thank you for working on this, very useful feature. Sharing a few thoughts:
>
> 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM resources in parallel_vacuum_compute_workers?
Actually, autovacuum_max_parallel_workers is already limited by
max_parallel_workers. It is not clear to me why we allow setting this GUC
higher than max_parallel_workers, but if that happens, I think it is a
misconfiguration on the user's side.
> 2. Is it intentional that other autovacuum workers not yield cost limits to the parallel auto vacuum workers? Cost limits are distributed first equally to the autovacuum workers.
> and then they share that. Therefore, parallel workers will be heavily throttled. IIUC, this problem doesn't exist with manual vacuum.
> If we don't fix this, at least we should document this.
Parallel a/v workers inherit the cost-based parameters (including
vacuum_cost_limit) from the leader worker. Do you mean that this value can be
too low for a parallel operation? If so, the user can manually increase the
autovacuum_vacuum_cost_limit reloption for those tables where parallel a/v
sleeps too much (due to cost delay).
BTW, describing the cost limit propagation to the parallel a/v workers is
worth mentioning in the documentation. I'll add it in the next patch version.
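To illustrate the propagation idea for readers of the thread, here is a
minimal standalone sketch of a generation-counter handoff between a leader
and its workers. It is not the patch code: it uses C11 atomics and a toy
spinlock instead of PostgreSQL's pg_atomic_uint32 and slock_t, carries only
two of the parameters, and all type and function names (SharedCostParams,
WorkerCostState, leader_init, leader_propagate, worker_refresh) are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared cost-parameter state. */
typedef struct SharedCostParams
{
    atomic_uint generation;     /* bumped by the leader after each update */
    atomic_flag lock;           /* toy spinlock protecting the fields below */
    double      cost_delay;
    int         cost_limit;
} SharedCostParams;

/* Worker-local copy of the parameters it last applied. */
typedef struct WorkerCostState
{
    unsigned int seen_generation;   /* 0 means nothing applied yet */
    double       cost_delay;
    int          cost_limit;
} WorkerCostState;

static void
params_lock(SharedCostParams *s)
{
    while (atomic_flag_test_and_set(&s->lock))
        ;                       /* spin until the flag is released */
}

static void
params_unlock(SharedCostParams *s)
{
    atomic_flag_clear(&s->lock);
}

/* Leader: set the initial values; generation starts at 1. */
void
leader_init(SharedCostParams *s, double delay, int limit)
{
    atomic_flag_clear(&s->lock);
    s->cost_delay = delay;
    s->cost_limit = limit;
    atomic_init(&s->generation, 1);
}

/*
 * Leader: publish new values only if something changed, then bump the
 * generation. Reading without the lock is fine here because only the
 * leader ever writes these fields. Returns true if an update was published.
 */
bool
leader_propagate(SharedCostParams *s, double delay, int limit)
{
    if (s->cost_delay == delay && s->cost_limit == limit)
        return false;

    params_lock(s);
    s->cost_delay = delay;
    s->cost_limit = limit;
    params_unlock(s);

    atomic_fetch_add(&s->generation, 1);
    return true;
}

/* Worker: cheap generation check first, copy under the lock on mismatch. */
bool
worker_refresh(SharedCostParams *s, WorkerCostState *w)
{
    unsigned int gen = atomic_load(&s->generation);

    if (gen == w->seen_generation)
        return false;

    params_lock(s);
    w->cost_delay = s->cost_delay;
    w->cost_limit = s->cost_limit;
    params_unlock(s);

    w->seen_generation = gen;
    return true;
}
```

Because the worker's local generation starts at 0 and the shared one at 1,
the first worker_refresh() call always loads the leader's values. After
that, the common case costs a single atomic read per delay point.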
> 3. Additionally, is there a point where, based on the cost limits, launching additional workers becomes counterproductive compared to running fewer workers and preventing it?
I don't think we can find a universal limit that will be appropriate for all
possible configurations. For now we are using a pretty simple formula for the
parallel degree calculation. Since the user has several ways to affect this
formula, I guess there will be no problems with it (except for my concerns
about the opt-out style).
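For reference, the parallel degree calculation can be sketched roughly as
below. This is a simplified standalone model, not the patch code: the
function name compute_parallel_workers and its flat integer arguments are
hypothetical, and the step where the leader takes one index for itself is an
assumption based on the existing parallel VACUUM code rather than on the
hunks shown in this thread.

```c
/* Smaller of two ints. */
static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * nindexes_parallel: number of indexes eligible for parallel processing
 * nrequested:        per-table request (> 0), or 0 for "no explicit request"
 * max_workers:       the applicable cap (autovacuum_max_parallel_workers
 *                    for autovacuum in the patch)
 */
int
compute_parallel_workers(int nindexes_parallel, int nrequested, int max_workers)
{
    int workers;

    /* Parallelism is disabled by the cap. */
    if (max_workers == 0)
        return 0;

    /* Assumption: the leader processes one index itself. */
    nindexes_parallel--;
    if (nindexes_parallel <= 0)
        return 0;

    /* Honor an explicit request, but never exceed the eligible indexes. */
    workers = (nrequested > 0) ?
        min_int(nrequested, nindexes_parallel) : nindexes_parallel;

    /* Cap by the GUC. */
    return min_int(workers, max_workers);
}
```

Under these assumptions, a table with 8 eligible indexes, no explicit request
and a cap of 7 yields 7 workers, and a table with 4 eligible indexes, a
request of 2 and a cap of 2 yields 2 workers.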
> 4. Would it make sense to add a table level override to disable parallelism or set parallel worker count?
We already have the "autovacuum_parallel_workers" reloption, which is used as
an additional limit on the number of parallel workers. In particular, this
reloption can be used to disable parallelism entirely.
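As a rough illustration of how the reloption value maps to the worker
request, here is a small standalone sketch modeled on the
table_recheck_autovac() hunk in the patch. The function name
reloption_to_nworkers and the has_reloptions flag are hypothetical; the real
code works on AutoVacOpts and VacuumParams.

```c
#include <stdbool.h>

/*
 * Map the autovacuum_parallel_workers reloption to the nworkers request:
 *   -1  parallel vacuum is disabled for this table
 *    0  no explicit request, the degree is chosen from the indexes
 *    N  at most N parallel workers are requested
 */
int
reloption_to_nworkers(bool has_reloptions, int autovacuum_parallel_workers)
{
    int nworkers = 0;           /* default when no reloptions are set */

    if (has_reloptions)
    {
        if (autovacuum_parallel_workers == 0)
            nworkers = -1;      /* reloption 0 disables parallelism */
        else if (autovacuum_parallel_workers > 0)
            nworkers = autovacuum_parallel_workers;
        /* reloption -1 (the default) keeps nworkers = 0 */
    }

    return nworkers;
}
```

Note the inversion: a reloption value of 0 turns into nworkers = -1
(disabled), while the reloption default of -1 turns into nworkers = 0 (let
the degree calculation decide).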
>
> I ran some perf tests to show the improvements with parallel vacuum and shared below.
Thank you very much!
> Observations:
>
> 1. Parallel autovacuum provides consistent speedup. With cost_limit=200 and
> 7 workers, vacuum completes 1.41x faster (71s -> 50s). With cost_limit=60,
> the speedup is 1.25x (194s -> 154s).
> 2. I see the benefit comes from parallelizing index vacuum. With 8 indexes totaling
> ~530 MB, parallel workers scan indexes concurrently instead of the leader
> scanning them one by one. The leader's CPU user time drops from ~3s to
> ~0.8s as index work is offloaded
>
A 1.41x speedup with 7 parallel workers may not seem like a great win, but it
is measured over the whole autovacuum operation (not only index
bulkdel/cleanup), and the indexes are pretty small.
May I ask you to run the same test with a larger table (several dozen
gigabytes)? I think the results will be more "expressive".
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-30 10:40 Daniil Davydov <[email protected]>
parent: Daniil Davydov <[email protected]>
2 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-30 10:40 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Mon, Mar 30, 2026 at 3:44 PM Daniil Davydov <[email protected]> wrote:
>
> BTW, describing the cost limit propagation to the parallel a/v workers is
> worth mentioning in the documentation. I'll add it in the next patch version.
>
You can find these changes in the v33 patch.
I mentioned the propagation of cost delay parameters only in "The Autovacuum
Daemon" chapter. I am not sure that we should also write about parallel
workers in the "Vacuuming" chapter (within the description of cost-based
parameters), since VACUUM PARALLEL doesn't do so.
The only change in the 0001 patch is the removal of a redundant empty line
inside autovacuum.c.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v33-0001-Parallel-autovacuum.patch (32.0K, 2-v33-0001-Parallel-autovacuum.patch)
From cddb1c13c57f270484606ff2104e2b985a5a7939 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v33 1/2] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 +
src/backend/access/heap/vacuumlazy.c | 12 +
src/backend/commands/vacuum.c | 21 +-
src/backend/commands/vacuumparallel.c | 223 +++++++++++++++++-
src/backend/postmaster/autovacuum.c | 25 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 9 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/commands/vacuum.h | 2 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 195 +++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
20 files changed, 538 insertions(+), 14 deletions(-)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index a6002ae9b07..56dc99bdbe1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -236,6 +236,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1969,6 +1978,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f698c2d899b..9fd4f6febbe 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -862,6 +863,17 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..1b5ba3ce1ef 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..13544de5b93 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -16,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -37,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -51,6 +61,33 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -120,6 +157,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -222,6 +271,17 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -233,6 +293,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -374,8 +435,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -392,6 +454,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -534,6 +614,103 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For non-autovacuum parallel workers, this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for the autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters had not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -555,12 +732,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +781,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -730,6 +912,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause after workers are launched but before index
+ * vacuuming begins.
+ */
+ if (nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1064,7 +1256,21 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
@@ -1119,6 +1325,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d695f1de4bd..c6c4f0dbb55 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1688,7 +1688,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
@@ -2928,8 +2928,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2939,6 +2937,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum, if the reloption sets the parallel
+ * degree as zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
+
/*
* Later, in vacuum_rel(), we check reloptions for any
* vacuum_max_eager_freeze_failure_rate override.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..1ac8e8fc3be 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,14 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note,
+ * that parallel autovacuum leader is an exception because only cost-based
+ * delays need to be affected also to parallel autovacuum workers. These
+ * parameters are propagated to its workers during parallel vacuum (see
+ * vacuumparallel.c for details).
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0a862693fcd..6ef46d88155 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -170,6 +170,14 @@
max => '10.0',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cf15597385b..73c49f09aef 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 523d3f39fc5..f6bf072bab5 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f45bca015c..8b42808e70b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -422,6 +422,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..00190c67ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..336a212faf4 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..929659956cb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..2aca32374a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,195 @@
+# Test parallel autovacuum behavior
+
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
+$node->append_conf(
+ 'postgresql.conf', qq{
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e3c1007abdf..aeaf9307558 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2097,6 +2097,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.43.0
[text/x-patch] v33-0002-Documantation-for-parallel-autovacuum.patch (9.2K, 3-v33-0002-Documantation-for-parallel-autovacuum.patch)
From 235f5c8b31db8b81ff9735707c95128201131d2d Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sat, 28 Mar 2026 17:37:07 +0700
Subject: [PATCH v33 2/2] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 19 ++++++++++++++
doc/src/sgml/maintenance.sgml | 42 ++++++++++++++++++++++++++++++
doc/src/sgml/ref/create_table.sgml | 18 +++++++++++++
doc/src/sgml/ref/vacuum.sgml | 24 +++--------------
4 files changed, 83 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 229f41353eb..cbe73172b2e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9485,6 +9486,24 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel workers that can be used
+ for <xref linkend="parallel-vacuum"/> by a single autovacuum
+ worker. This value is capped by <xref linkend="guc-max-parallel-workers"/>.
+ The default is 0, which disables parallel index vacuuming during
+ autovacuum.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0d2a28207ed..e350226610a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -927,6 +927,13 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker processes a table whose
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enables parallelism, it can launch parallel workers to perform
+ <xref linkend="parallel-vacuum"/>.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
@@ -1038,6 +1045,10 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
per-table <literal>autovacuum_vacuum_cost_delay</literal> or
<literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
are not considered in the balancing algorithm.
+ Parallel workers launched for <xref linkend="parallel-vacuum"/> use
+ the same cost delay parameters as the leader worker. If any of these
+ parameters are changed in the leader worker, it propagates the new
+ parameter values to all of its parallel workers.
</para>
<para>
@@ -1166,6 +1177,37 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</sect3>
</sect2>
+
+ <sect2 id="parallel-vacuum" xreflabel="Parallel Vacuum">
+ <title>Parallel Vacuum</title>
+
+ <para>
+ Both the <command>VACUUM</command> command and the autovacuum daemon can
+ perform the index vacuum and index cleanup phases in parallel using
+ <replaceable class="parameter">integer</replaceable> background workers
+ (for the details of each vacuum phase, please refer to
+ <xref linkend="vacuum-phases"/>). The number of workers used to perform
+ the operation is equal to the number of indexes on the relation that
+ support parallel vacuum, which may be further limited by the value specific
+ to <command>VACUUM</command> or autovacuum (see the <literal>PARALLEL</literal>
+ option for <xref linkend="sql-vacuum"/> or
+ <xref linkend="reloption-autovacuum-parallel-workers"/>, respectively).
+ </para>
+
+ <para>
+ An index can participate in parallel vacuum if and only if the size of the
+ index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+ Please note that it is not guaranteed that the number of parallel workers
+ specified in <replaceable class="parameter">integer</replaceable> will be
+ used during execution. It is possible for a vacuum to run with fewer
+ workers than specified, or even with no workers at all. Only one worker
+ can be used per index. So parallel workers are launched only when there
+ are at least <literal>2</literal> indexes in the table. Workers for
+ vacuum are launched before the start of each phase and exit at the end of
+ the phase. These behaviors might change in a future release.
+ </para>
+
+ </sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 80829b23945..8f5b25411cd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1738,6 +1738,24 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can be used
+ for <xref linkend="parallel-vacuum"/>.
+ The default value is 0, which means no parallel index vacuuming for
+ this table. If the value is -1, the parallel degree is computed based on
+ the number of indexes and limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index ac5d083d468..d3f32851386 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -81,7 +81,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
is not obtained. However, extra space is not returned to the operating
system (in most cases); it's just kept available for re-use within the
same table. It also allows us to leverage multiple CPUs in order to process
- indexes. This feature is known as <firstterm>parallel vacuum</firstterm>.
+ indexes. This feature is known as <xref linkend="parallel-vacuum"/>.
To disable this feature, one can use <literal>PARALLEL</literal> option and
specify parallel workers as zero. <command>VACUUM FULL</command> rewrites
the entire contents of the table into a new disk file with no extra space,
@@ -266,25 +266,9 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<term><literal>PARALLEL</literal></term>
<listitem>
<para>
- Perform index vacuum and index cleanup phases of <command>VACUUM</command>
- in parallel using <replaceable class="parameter">integer</replaceable>
- background workers (for the details of each vacuum phase, please
- refer to <xref linkend="vacuum-phases"/>). The number of workers used
- to perform the operation is equal to the number of indexes on the
- relation that support parallel vacuum which is limited by the number of
- workers specified with <literal>PARALLEL</literal> option if any which is
- further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
- An index can participate in parallel vacuum if and only if the size of the
- index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
- Please note that it is not guaranteed that the number of parallel workers
- specified in <replaceable class="parameter">integer</replaceable> will be
- used during execution. It is possible for a vacuum to run with fewer
- workers than specified, or even with no workers at all. Only one worker
- can be used per index. So parallel workers are launched only when there
- are at least <literal>2</literal> indexes in the table. Workers for
- vacuum are launched before the start of each phase and exit at the end of
- the phase. These behaviors might change in a future release. This
- option can't be used with the <literal>FULL</literal> option.
+ Limits the number of parallel workers used to perform the
+ <xref linkend="parallel-vacuum"/>, which is further limited by
+ <xref linkend="guc-max-parallel-maintenance-workers"/>.
</para>
</listitem>
</varlistentry>
--
2.43.0
[text/x-patch] v32--v33-diff-for-0002.patch (1.0K, 4-v32--v33-diff-for-0002.patch)
From 1b41d6e7a95f3b8507ddc7f10a2a44b6f9bdf1a4 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Mon, 30 Mar 2026 17:23:23 +0700
Subject: [PATCH] temp
---
doc/src/sgml/maintenance.sgml | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index a1b851ea58f..e350226610a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1045,6 +1045,10 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
per-table <literal>autovacuum_vacuum_cost_delay</literal> or
<literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
are not considered in the balancing algorithm.
+ Parallel workers launched for <xref linkend="parallel-vacuum"/> use
+ the same cost delay parameters as the leader worker. If any of these
+ parameters are changed in the leader worker, it propagates the new
+ parameter values to all of its parallel workers.
</para>
<para>
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-31 00:14 SATYANARAYANA NARLAPURAM <[email protected]>
parent: Daniil Davydov <[email protected]>
2 siblings, 1 reply; 112+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-03-31 00:14 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi
On Mon, Mar 30, 2026 at 1:44 AM Daniil Davydov <[email protected]> wrote:
> Hi,
>
> On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM
> <[email protected]> wrote:
> >
> > Thank you for working on this, very useful feature. Sharing a few thoughts:
> >
> > 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM
> > resources in parallel_vacuum_compute_workers?
>
> Actually, autovacuum_max_parallel_workers is already limited by
> max_parallel_workers. It is not clear for me why we allow setting this GUC
> higher than max_parallel_workers, but if this happens, I think it is a
> user's misconfiguration.
>
> > 2. Is it intentional that other autovacuum workers do not yield cost limits
> > to the parallel autovacuum workers? Cost limits are first distributed
> > equally to the autovacuum workers, and then they share that. Therefore,
> > parallel workers will be heavily throttled. IIUC, this problem doesn't
> > exist with manual vacuum.
> > If we don't fix this, at least we should document this.
>
> Parallel a/v workers inherit cost based parameters (including the
> vacuum_cost_limit) from the leader worker. Do you mean that this can be too
> low value for parallel operation? If so, user can manually increase the
> vacuum_cost_limit reloption for those tables, where parallel a/v sleeps too
> much (due to cost delay).
>
> BTW, describing the cost limit propagation to the parallel a/v workers is
> worth mentioning in the documentation. I'll add it in the next patch
> version.
>
> > 3. Additionally, is there a point where, based on the cost limits,
> > launching additional workers becomes counterproductive compared to running
> > fewer workers and preventing it?
>
> I don't think that we can possibly find a universal limit that will be
> appropriate for all possible configurations. By now we are using a pretty
> simple formula for parallel degree calculation. Since user have several ways
> to affect this formula, I guess that there will be no problems with it
> (except my concerns about opt-out style).
>
> > 4. Would it make sense to add a table level override to disable
> > parallelism or set parallel worker count?
>
> We already have the "autovacuum_parallel_workers" reloption that is used as
> an additional limit for the number of parallel workers. In particular, this
> reloption can be used to disable parallelism at all.
>
> >
> > I ran some perf tests to show the improvements with parallel vacuum and
> > shared below.
>
> Thank you very much!
>
> > Observations:
> >
> > 1. Parallel autovacuum provides consistent speedup. With cost_limit=200 and
> > 7 workers, vacuum completes 1.41x faster (71s -> 50s). With cost_limit=60,
> > the speedup is 1.25x (194s -> 154s).
> > 2. I see the benefit comes from parallelizing index vacuum. With 8 indexes
> > totaling ~530 MB, parallel workers scan indexes concurrently instead of the
> > leader scanning them one by one. The leader's CPU user time drops from ~3s
> > to ~0.8s as index work is offloaded
> >
>
> 1.41 speedup with 7 parallel workers may not seem like a great win, but it is
> a whole time of autovacuum operation (not only index bulkdel/cleanup) with
> pretty small indexes.
>
> May I ask you to run the same test with a higher table's size (several dozen
> gigabytes)? I think the results will be more "expressive".
>
I ran it with a billion rows in a table with 8 indexes. The improvement
with 7 workers is 1.8x.
Please note that there is a fixed overhead in the other vacuum steps, for
example the heap scan.
In environments where cost-based delay is used (the default), the benefits
will be modest unless vacuum_cost_delay is set to a sufficiently large value.
*Hardware:*
CPU: Intel Xeon Platinum 8573C, 1 socket × 8 cores × 2 threads = 16 vCPUs
RAM: 128 GB (131,900 MB)
Swap: None
*Workload Description*
*Table Schema:*
CREATE TABLE avtest (
    id      bigint PRIMARY KEY,
    col1    int,        -- random()*1e9
    col2    int,        -- random()*1e9
    col3    int,        -- random()*1e9
    col4    int,        -- random()*1e9
    col5    int,        -- random()*1e9
    col6    text,       -- 'text_' || random()*1e6 (short text, ~10 chars)
    col7    timestamp,  -- now() - random()*365 days
    padding text        -- repeat('x', 50)
) WITH (fillfactor = 90);
*Indexes (8 total):*
avtest_pkey — btree on (id) bigint
idx_av_col1 — btree on (col1) int
idx_av_col2 — btree on (col2) int
idx_av_col3 — btree on (col3) int
idx_av_col4 — btree on (col4) int
idx_av_col5 — btree on (col5) int
idx_av_col6 — btree on (col6) text
idx_av_col7 — btree on (col7) timestamp
Dead Tuple Generation:
DELETE FROM avtest WHERE id % 5 IN (1, 2);
This deletes exactly 40% of rows, uniformly distributed across all pages.
Vacuum Trigger:
Autovacuum is triggered naturally by lowering the threshold to 0 and setting
scale_factor to a value that causes immediate launch after the DELETE.
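
To make the trigger condition concrete, here is a rough sketch of the arithmetic, assuming the documented autovacuum rule that a table is vacuumed once its dead tuples exceed base threshold + scale factor * number of tuples. The scale factor values are hypothetical (the mail does not state the one used), and the sketch ignores any upper cap on the threshold that newer releases may apply.

```python
def autovacuum_vacuum_threshold(base_threshold, scale_factor, reltuples):
    """Dead-tuple count above which autovacuum considers vacuuming a table."""
    return base_threshold + scale_factor * reltuples


rows = 1_000_000_000         # table size from the benchmark
dead_tuples = 400_000_000    # 40% of rows deleted

# Hypothetical scale factors; the base threshold is 0 as described in the mail.
for scale_factor in (0.1, 0.5):
    threshold = autovacuum_vacuum_threshold(0, scale_factor, rows)
    print(f"scale_factor={scale_factor}: threshold={threshold:.0f}, "
          f"triggers={dead_tuples > threshold}")
```

With a base threshold of 0, any scale factor below 0.4 would let the DELETE above trigger autovacuum in this model.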
Worker Configurations Tested:
0 workers — leader-only vacuum (baseline, no parallelism)
2 workers — leader + 2 parallel workers (3 processes total)
4 workers — leader + 4 parallel workers (5 processes total)
7 workers — leader + 7 parallel workers (8 processes total, 1 per index)
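
The worker counts above can be related to the limits discussed in this thread with a small model. This is only a sketch, not the patch's code: it assumes the reloption semantics from the v34 commit message (-1 defers to the GUC, 0 disables, a positive value caps) and that the leader processes one index itself, so at most (indexes - 1) extra workers are useful. The function name is made up for illustration, and the real launch count can be lower still, for example when max_parallel_workers is exhausted.

```python
def planned_parallel_workers(n_parallel_indexes, reloption, guc_limit):
    """Simplified model of the planned worker count for one autovacuum run.

    reloption: autovacuum_parallel_workers (-1 defers to the GUC, 0 disables).
    guc_limit: autovacuum_max_parallel_workers (0 disables the feature).
    """
    if reloption == 0 or guc_limit <= 0:
        return 0
    if n_parallel_indexes < 2:
        return 0  # a single index is handled by the leader alone
    # Assumption: the leader vacuums one index itself.
    useful = n_parallel_indexes - 1
    limit = guc_limit if reloption == -1 else min(reloption, guc_limit)
    return min(useful, limit)


print(planned_parallel_workers(8, -1, 7))  # benchmark table: 8 indexes
print(planned_parallel_workers(4, 2, 2))   # TAP test table: pkey + 3 indexes
```

Under these assumptions, 8 indexes and a limit of 7 give one process per index, which matches the "8 processes total" row above.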
Dataset:
Rows: 1,000,000,000
Heap size: 139 GB
Total size: 279 GB (heap + 8 indexes)
Dead tuples: 400,000,000 (40%)
Index Sizes:
avtest_pkey 21 GB (bigint)
idx_av_col7 21 GB (timestamp)
idx_av_col1 18 GB (int)
idx_av_col2 18 GB (int)
idx_av_col3 18 GB (int)
idx_av_col4 18 GB (int)
idx_av_col5 18 GB (int)
idx_av_col6 7 GB (text — shorter keys, smaller index)
Total indexes: 139 GB
Server Settings:
shared_buffers = 96GB
maintenance_work_mem = 1GB
max_wal_size = 100GB
checkpoint_timeout = 1h
autovacuum_vacuum_cost_delay = 0ms (NO throttling)
autovacuum_vacuum_cost_limit = 1000
*Summary:*
Workers   Avg(s)    Min(s)    Max(s)    Speedup   Time Saved
-------   -------   -------   -------   -------   ------------------
0         1645.93   1645.01   1646.84   1.00x     -
2         1276.35   1275.64   1277.05   1.29x     369.58s (6.2 min)
4         1052.62   1048.92   1056.32   1.56x     593.31s (9.9 min)
7          892.23    886.59    897.86   1.84x     753.70s (12.6 min)
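
For anyone who wants to re-derive the Speedup and Time Saved columns, a minimal sketch follows. The averages are copied from the table; the helper name is made up for illustration.

```python
# Average wall-clock seconds per parallel worker count, copied from the table.
avg_seconds = {0: 1645.93, 2: 1276.35, 4: 1052.62, 7: 892.23}


def summarize(timings, baseline_workers=0):
    """Return {workers: (speedup, seconds_saved)} relative to the baseline run."""
    base = timings[baseline_workers]
    return {w: (base / t, base - t) for w, t in timings.items()}


summary = summarize(avg_seconds)
for workers, (speedup, saved) in sorted(summary.items()):
    print(f"{workers} workers: {speedup:.2f}x, "
          f"saved {saved:.2f}s ({saved / 60:.1f} min)")
```

The speedup is sub-linear in the number of workers, which is consistent with the fixed heap-scan overhead mentioned above.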
Thanks,
Satya
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-31 07:09 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-31 07:09 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Mar 30, 2026 at 3:40 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Mon, Mar 30, 2026 at 3:44 PM Daniil Davydov <[email protected]> wrote:
> >
> > BTW, describing the cost limit propagation to the parallel a/v workers is
> > worth mentioning in the documentation. I'll add it in the next patch version.
> >
>
> You can find these changes in the v33 patch.
>
> I mentioned cost delay parameters propagation only in "The Autovacuum Daemon"
> chapter. I am not sure that we also should write about parallel workers in the
> "Vacuuming" chapter (within cost based parameters description) since VACUUM
> PARALLEL doesn't do so.
>
> The only change in the 0001 patch is removing redundant empty line
> inside autovacuum.c .
Thank you for updating the patch!
I've made some changes to the documentation part, merged two patches
into one, and updated the commit message. Please review the attached
patch.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-patch] v34-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch (42.4K, 2-v34-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch)
From a389a11d7f47a01f8f62d32b3aa56c19c21871ee Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v34] Allow autovacuum to use parallel vacuum workers.
Previously, autovacuum always disabled parallel vacuum regardless of
the table's index count or configuration. This commit enables
autovacuum workers to use parallel index vacuuming and index cleanup,
using the same parallel vacuum infrastructure as manual VACUUM.
Two new configuration options control the feature. The GUC
autovacuum_max_parallel_workers sets the maximum number of parallel
workers a single autovacuum worker may launch; it defaults to 0,
preserving existing behavior unless explicitly enabled. The per-table
storage parameter autovacuum_parallel_workers provides per-table limits.
A value of 0 disables parallel vacuum for the table, a positive value
caps the worker count (still bounded by the GUC), and -1 (the default)
defers to the GUC.
To handle cases where autovacuum workers receive a SIGHUP and update
their cost-based vacuum delay parameters mid-operation, a new
propagation mechanism is added to vacuumparallel.c. The leader stores
its effective cost parameters in a DSM segment. Parallel vacuum
workers poll for changes in vacuum_delay_point(); if an update is
detected, they apply the new values locally via VacuumUpdateCosts().
A new test module, src/test/modules/test_autovacuum, is added to
verify that parallel autovacuum workers are correctly launched and
that cost-parameter updates are propagated as expected.
Author: Daniil Davydov <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Sami Imseih <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Bharath Rupireddy <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
Reviewed-by: zengman <[email protected]>
Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com
---
doc/src/sgml/config.sgml | 23 ++
doc/src/sgml/maintenance.sgml | 32 +++
doc/src/sgml/ref/create_table.sgml | 16 ++
doc/src/sgml/ref/vacuum.sgml | 23 +-
src/backend/access/common/reloptions.c | 11 +
src/backend/access/heap/vacuumlazy.c | 12 +
src/backend/commands/vacuum.c | 21 +-
src/backend/commands/vacuumparallel.c | 223 +++++++++++++++++-
src/backend/postmaster/autovacuum.c | 25 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 9 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/commands/vacuum.h | 2 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 197 ++++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
24 files changed, 615 insertions(+), 33 deletions(-)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 229f41353eb..70d8771f0af 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9485,6 +9486,28 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel workers that can be used by a
+ single autovacuum worker to process indexes. This limit applies
+ specifically to the index vacuuming and index cleanup phases of
+ autovacuum. The actual number of parallel workers is further limited
+ by <xref linkend="guc-max-parallel-workers"/>. This is the
+ per-autovacuum worker equivalent of the <literal>PARALLEL</literal>
+ option of the <link linkend="sql-vacuum"><command>VACUUM</command></link>
+ command. Setting this value to 0 disables parallel vacuum during autovacuum.
+ The default is 0.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0d2a28207ed..eb6a07e086d 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1038,6 +1038,10 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
per-table <literal>autovacuum_vacuum_cost_delay</literal> or
<literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
are not considered in the balancing algorithm.
+ Parallel workers launched for <xref linkend="parallel-vacuum"/> use
+ the same cost delay parameters as the leader worker. If any of these
+ parameters are changed in the leader worker, it propagates the new
+ parameter values to all of its parallel workers.
</para>
<para>
@@ -1166,6 +1170,34 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</sect3>
</sect2>
+
+ <sect2 id="parallel-vacuum" xreflabel="Parallel Vacuum">
+ <title>Parallel Vacuum</title>
+
+ <para>
+ <command>VACUUM</command> can perform index vacuuming and index cleanup
+ phases in parallel using background workers (for the details of each
+ vacuum phase, please refer to <xref linkend="vacuum-phases"/>). The
+ degree of parallelism is determined by the number of indexes on the
+ relation that support parallel vacuum, limited by the <literal>PARALLEL</literal>
+ option (for manual <command>VACUUM</command>) or the
+ <xref linkend="guc-autovacuum-max-parallel-workers"/> parameter (for
+ autovacuum).
+ </para>
+
+ <para>
+ An index can participate in parallel vacuum if and only if the size of the
+ index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+ Please note that it is not guaranteed that the number of parallel workers
+ specified in <replaceable class="parameter">integer</replaceable> will be
+ used during execution. It is possible for a vacuum to run with fewer
+ workers than specified, or even with no workers at all. Only one worker
+ can be used per index. So parallel workers are launched only when there
+ are at least <literal>2</literal> indexes in the table. Workers for
+ vacuum are launched before the start of each phase and exit at the end of
+ the phase. These behaviors might change in a future release.
+ </para>
+ </sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 80829b23945..e342585c7f0 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1738,6 +1738,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Per-table value for <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ parameter. If -1 is specified, <varname>autovacuum_max_parallel_workers</varname>
+ value will be used. If set to 0, parallel vacuum is disabled for
+ this table. The default value is -1.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index ac5d083d468..38ee973ea05 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -81,7 +81,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
is not obtained. However, extra space is not returned to the operating
system (in most cases); it's just kept available for re-use within the
same table. It also allows us to leverage multiple CPUs in order to process
- indexes. This feature is known as <firstterm>parallel vacuum</firstterm>.
+ indexes. This feature is known as <firstterm><xref linkend="parallel-vacuum"/></firstterm>.
To disable this feature, one can use <literal>PARALLEL</literal> option and
specify parallel workers as zero. <command>VACUUM FULL</command> rewrites
the entire contents of the table into a new disk file with no extra space,
@@ -266,24 +266,9 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<term><literal>PARALLEL</literal></term>
<listitem>
<para>
- Perform index vacuum and index cleanup phases of <command>VACUUM</command>
- in parallel using <replaceable class="parameter">integer</replaceable>
- background workers (for the details of each vacuum phase, please
- refer to <xref linkend="vacuum-phases"/>). The number of workers used
- to perform the operation is equal to the number of indexes on the
- relation that support parallel vacuum which is limited by the number of
- workers specified with <literal>PARALLEL</literal> option if any which is
- further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
- An index can participate in parallel vacuum if and only if the size of the
- index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
- Please note that it is not guaranteed that the number of parallel workers
- specified in <replaceable class="parameter">integer</replaceable> will be
- used during execution. It is possible for a vacuum to run with fewer
- workers than specified, or even with no workers at all. Only one worker
- can be used per index. So parallel workers are launched only when there
- are at least <literal>2</literal> indexes in the table. Workers for
- vacuum are launched before the start of each phase and exit at the end of
- the phase. These behaviors might change in a future release. This
+ Specifies the maximum number of parallel workers that can be used
+ for <xref linkend="parallel-vacuum"/>, which is further limited
+ by <xref linkend="guc-max-parallel-maintenance-workers"/>. This
option can't be used with the <literal>FULL</literal> option.
</para>
</listitem>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b41eafd7691..3e832c3797e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -236,6 +236,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1969,6 +1978,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 24001b27387..71f7493ff29 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -862,6 +863,17 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params.nworkers);
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 766a518c7a1..f0d74870f93 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..683a0f34e24 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -16,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -37,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -51,6 +61,33 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -120,6 +157,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -222,6 +271,17 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -233,6 +293,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -374,8 +435,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -392,6 +454,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -534,6 +614,103 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For non-autovacuum parallel workers, this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if this worker is not running on behalf of autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if the parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -555,12 +732,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +781,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by the applicable GUC (max_workers, chosen above) */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -730,6 +912,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause after workers are launched but before index
+ * vacuuming begins.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1064,7 +1256,21 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
@@ -1119,6 +1325,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index d695f1de4bd..c6c4f0dbb55 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1688,7 +1688,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE, or a parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
@@ -2928,8 +2928,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2939,6 +2937,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum if the reloption sets the parallel
+ * degree to zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
+
/*
* Later, in vacuum_rel(), we check reloptions for any
* vacuum_max_eager_freeze_failure_rate override.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..15048aa9e56 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,14 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to reach its parallel workers. These
+ * parameters are propagated to the workers during parallel vacuum (see
+ * vacuumparallel.c for details).
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0a862693fcd..6ef46d88155 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -170,6 +170,14 @@
max => '10.0',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cf15597385b..73c49f09aef 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index adcff1f6ffb..77098a6dd6f 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5d351f0df33..3e8bee024df 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -422,6 +422,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 04f29748be7..3eaa4655c88 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..336a212faf4 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..929659956cb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..9e65eafb549
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,197 @@
+
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+# Test parallel autovacuum behavior
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
+$node->append_conf(
+ 'postgresql.conf', qq{
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader. It updates the shared parameters during the heap scan (when
+# vacuum_delay_point() is called) and then launches a parallel vacuum worker, but
+# stops at the second injection point before index vacuuming begins.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
+
+# Check that the parallel worker picked up all updated parameters during
+# index processing
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 69a71db1496..3aba35c2157 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2097,6 +2097,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.53.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-31 07:46 SATYANARAYANA NARLAPURAM <[email protected]>
parent: Daniil Davydov <[email protected]>
2 siblings, 1 reply; 112+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-03-31 07:46 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi
On Mon, Mar 30, 2026 at 1:44 AM Daniil Davydov <[email protected]> wrote:
> Hi,
>
> On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM
> <[email protected]> wrote:
> >
> > Thank you for working on this, very useful feature. Sharing a few
> thoughts:
> >
> > 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM
> resources in parallel_vacuum_compute_workers?
>
> Actually, autovacuum_max_parallel_workers is already limited by
> max_parallel_workers. It is not clear for me why we allow setting this GUC
> higher than max_parallel_workers, but if this happens, I think it is a
> user's
> misconfiguration.
Isn't there wasted effort here if the user misconfigures it, since we cannot
launch that many workers anyway? I suggest adding a check here.
>
>
> > 2. Is it intentional that other autovacuum workers not yield cost limits
> to the parallel auto vacuum workers? Cost limits are distributed first
> equally to the autovacuum workers.
> > and then they share that. Therefore, parallel workers will be heavily
> throttled. IIUC, this problem doesn't exist with manual vacuum.
> > If we don't fix this, at least we should document this.
>
> Parallel a/v workers inherit cost based parameters (including the
> vacuum_cost_limit) from the leader worker. Do you mean that this can be too
> low value for parallel operation? If so, user can manually increase the
> vacuum_cost_limit reloption for those tables, where parallel a/v sleeps too
> much (due to cost delay).
They don't inherit them but share them, don't they?
>
> BTW, describing the cost limit propagation to the parallel a/v workers is
> worth mentioning in the documentation. I'll add it in the next patch
> version.
Yes, that helps
>
>
> > 3. Additionally, is there a point where, based on the cost limits,
> launching additional workers becomes counterproductive compared to running
> fewer workers and preventing it?
>
> I don't think that we can possibly find a universal limit that will be
> appropriate for all possible configurations. By now we are using a pretty
> simple formula for parallel degree calculation. Since user have several
> ways
> to affect this formula, I guess that there will be no problems with it
> (except
> my concerns about opt-out style).
Thanks,
Satya
>
>
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-31 13:26 Daniil Davydov <[email protected]>
parent: SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2026-03-31 13:26 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; Masahiko Sawada <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Mar 31, 2026 at 2:46 PM SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> On Mon, Mar 30, 2026 at 1:44 AM Daniil Davydov <[email protected]> wrote:
>>
>> Actually, autovacuum_max_parallel_workers is already limited by
>> max_parallel_workers. It is not clear for me why we allow setting this GUC
>> higher than max_parallel_workers, but if this happens, I think it is a user's
>> misconfiguration.
>
> Isn’t there a wasted effort here if user misconfigures because anyway we cannot launch that many workers? I suggest making a check here.
We had a fairly long discussion about this earlier in the thread. I also
think that the user has many ways to misconfigure postgres, but we
don't consider such misconfiguration our fault.
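For illustration only, the check being asked for could look roughly like the
sketch below. This is not code from the patch: compute_workers_sketch() and its
parameters are hypothetical names, and the assumption that the leader processes
one index itself is stated in the code rather than taken from the messages
above.

```c
#include <assert.h>

/* Return the smaller of two ints. */
static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * Hypothetical sketch: how many parallel workers to plan for one table.
 *
 * nindexes_eligible    - indexes large enough to be vacuumed in parallel
 * av_limit             - effective autovacuum parallel worker limit
 * max_parallel_workers - cluster-wide cap on parallel workers
 */
static int
compute_workers_sketch(int nindexes_eligible, int av_limit,
                       int max_parallel_workers)
{
    int         nworkers;

    assert(nindexes_eligible >= 0 && av_limit >= 0 &&
           max_parallel_workers >= 0);

    /* Assumption: the leader processes one index itself. */
    nworkers = (nindexes_eligible > 0) ? nindexes_eligible - 1 : 0;

    /* Cap by the autovacuum-specific limit. */
    nworkers = min_int(nworkers, av_limit);

    /* The extra check suggested above: never plan more than can launch. */
    return min_int(nworkers, max_parallel_workers);
}
```

With such a cap, a limit of 10 on a server that allows only 2 parallel workers
would plan 2 workers instead of sizing shared state for workers that can never
start.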
>> Parallel a/v workers inherit cost based parameters (including the
>> vacuum_cost_limit) from the leader worker. Do you mean that this can be too
>> low value for parallel operation? If so, user can manually increase the
>> vacuum_cost_limit reloption for those tables, where parallel a/v sleeps too
>> much (due to cost delay).
>
>
> They don’t inherit but share, isn’t it?
>
Yeah, let me clarify. At the beginning of parallel a/v, the leader a/v worker
creates and initializes a shared structure, where its cost based parameters
are stored. Then, all parallel workers will read them from shmem and update
their parameters accordingly.
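A minimal sketch of this idea, for illustration only. The struct and function
names below are hypothetical, and the generation counter used to detect
changes is an assumption made for the example, not necessarily how the patch
detects updates. The code is single-process; real shared-memory code would
need locking or atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the structure the leader puts in shared memory. */
typedef struct SharedCostParamsSketch
{
    uint32_t    generation;     /* bumped by the leader on every change */
    int         cost_limit;
    double      cost_delay;
} SharedCostParamsSketch;

/* Per-worker local copy plus the last generation the worker has seen. */
typedef struct WorkerCostStateSketch
{
    uint32_t    seen_generation;
    int         cost_limit;
    double      cost_delay;
} WorkerCostStateSketch;

/* Leader side: publish new values after a config reload. */
static void
leader_publish(SharedCostParamsSketch *shared, int cost_limit,
               double cost_delay)
{
    shared->cost_limit = cost_limit;
    shared->cost_delay = cost_delay;
    shared->generation++;
}

/* Worker side: called at each delay point; true if values were refreshed. */
static bool
worker_refresh(const SharedCostParamsSketch *shared,
               WorkerCostStateSketch *local)
{
    if (local->seen_generation == shared->generation)
        return false;           /* nothing changed since the last check */

    local->cost_limit = shared->cost_limit;
    local->cost_delay = shared->cost_delay;
    local->seen_generation = shared->generation;
    return true;
}
```

The point of the counter is that a worker's check at every delay point stays
cheap: it compares one integer and copies the parameters only when the leader
has actually published something new.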
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-31 14:18 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-03-31 14:18 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Mar 31, 2026 at 2:09 PM Masahiko Sawada <[email protected]> wrote:
>
> I've made some changes to the documentation part, merged two patches
> into one, and updated the commit message. Please review the attached
> patch.
>
Great, thank you very much!
Again, I don't know how to write documentation well, so feel free to ignore
my comments:
> + <command>VACUUM</command> can perform index vacuuming and index cleanup
Don't we need to mention autovacuum here too? I thought that VACUUM in this
context means the "manual VACUUM command".
> + ...applies specifically to the index vacuuming and index cleanup phases...
Maybe we can refer to "vacuum-phases" here?
All other changes look good to me.
!!!
> Searching for arguments in
> favor of opt-in style, I asked for help from another person who has been
> managing the setup of highload systems for decades. He promised to share his
> opinion next week.
I talked to Anton Doroshkevich today.
He confirmed that as a rule there are *hundreds of thousands* of tables in the
system, the vast majority of which do not need to be vacuumed in parallel mode.
He also suggested the following: let the reloption override the value of the
GUC parameter. I.e., even if the av_max_parallel_workers parameter is 0, the
user can still set av_parallel_workers to 10 for some table, and autovacuum
will process this table in parallel.
I remember that you want to use the GUC parameter as a global switch, and this
approach will break this logic. But according to Anton's words, it is okay if
the GUC parameter cannot disable parallel a/v for all tables instantly. It will
become an administrator's responsibility to manually turn off parallel a/v for
several tables (again, it is completely OK). Thus, this feature can be handy
for all use cases.
I hope this doesn't look like adapting to the needs of one specific user.
A lot of very large production systems are migrating to postgres now, and I
believe that we should ensure their comfort too.
What do you think? Can postgres have such logic?
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-03-31 21:19 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-03-31 21:19 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Tue, Mar 31, 2026 at 7:18 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Tue, Mar 31, 2026 at 2:09 PM Masahiko Sawada <[email protected]> wrote:
> >
> > I've made some changes to the documentation part, merged two patches
> > into one, and updated the commit message. Please review the attached
> > patch.
> >
>
> Great, thank you very much!
>
> Again, I don't know how to write the documentation well, so you can ignore
> my comments :
>
> > + <command>VACUUM</command> can perform index vacuuming and index cleanup
> Don't we need to mention autovacuum here too? I thought that VACUUM in the
> context means "manual VACUUM command".
I think that the documentation explains that the autovacuum daemon is
a worker automatically executing VACUUM and ANALYZE commands.
>
> > + ...applies specifically to the index vacuuming and index cleanup phases...
> Maybe we can refer to "vacuum-phases" here?
Agreed.
>
> All other changes look good to me.
>
> !!!
> > Searching for arguments in
> > favor of opt-in style, I asked for help from another person who has been
> > managing the setup of highload systems for decades. He promised to share his
> > opinion next week.
>
> I talked to Anton Doroshkevich today.
Thank you for sharing!
> He confirmed that as a rule there are *hundreds of thousands* of tables in the
> system, the vast majority of which do not need to be vacuumed in parallel mode.
I'm still struggling to see the technical justification; why would a
user want to avoid parallel vacuuming on eligible tables if they have
already explicitly allowed the system to use more resources by setting
autovacuum_max_parallel_workers to >0? If resource contention occurs,
it is typically a sign that the global parameters need re-tuning. As I
mentioned, the same contention can occur even with an opt-in style if
multiple tables are manually configured.
Also, I'm concerned that an opt-in style could confuse users, since
parallel vacuum is enabled by default in the VACUUM command.
> He also suggested the following : let the reloption overlap the value of the
> GUC parameter. I.e. even if av_max_parallel_workers parameters is 0 the user
> still can set the av_parallel_workers to 10 for some table, and autovacuum
> will process this table in parallel.
>
> I remember that you want to use the GUC parameter as a global switch, and this
> approach will break this logic. But according to Anton's words, it is okay if
> the GUC parameter cannot disable parallel a/v for all tables instantly. It will
> become an administrator's responsibility to manually turn off parallel a/v for
> several tables (again, it is completely OK). Thus, this feature can be handy
> for all use cases.
While some autovacuum parameters do override GUCs, those are typically
local to the process (like cost delay). Parallel workers, however, are
a shared system-wide resource. In a multi-tenant environment, allowing
a single table's reloption to bypass the
autovacuum_max_parallel_workers = 0 limit could lead to unexpected
exhaustion of the worker pool. I think that this GUC should act as a
reliable global switch for resource management.
> I hope it doesn't look like as an adapting to the needs of a specific user.
> A lot of super-large productions are migrating to postgres now, and I believe
> that we should ensure their comfort too.
I'm not prioritizing one specific use case over another. I believe
that there are also users who want to use parallel vacuum on hundreds
of thousands of tables. We should look for a better solution and
evaluate it from multiple perspectives, such as usability, robustness,
and consistency with existing features and behaviors.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-01 07:44 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Daniil Davydov @ 2026-04-01 07:44 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Wed, Apr 1, 2026 at 4:20 AM Masahiko Sawada <[email protected]> wrote:
>
> On Tue, Mar 31, 2026 at 7:18 AM Daniil Davydov <[email protected]> wrote:
> >
> > > + <command>VACUUM</command> can perform index vacuuming and index cleanup
> > Don't we need to mention autovacuum here too? I thought that VACUUM in the
> > context means "manual VACUUM command".
>
> I think that the documentation explains that the autovacuum daemon is
> a worker automatically executing VACUUM and ANALYZE commands.
>
Yeah, that's true. Then I agree with this change.
>
> > He confirmed that as a rule there are *hundreds of thousands* of tables in the
> > system, the vast majority of which do not need to be vacuumed in parallel mode.
>
> I'm still struggling to see the technical justification; why would a
> user want to avoid parallel vacuuming on eligible tables if they have
> already explicitly allowed the system to use more resources by setting
> autovacuum_max_parallel_workers to >0?
Here I am talking about the initial conditions, i.e. the situation the user is
in before considering whether to use our parameter. Based on this situation, it
seems to me that not everyone will want to turn on parallel a/v (because of the
risk of resource shortage).
> If resource contention occurs,
> it is typically a sign that the global parameters need re-tuning. As I
> mentioned, the same contention can occur even with an opt-in style if
> multiple tables are manually configured.
>
Yep, we already discussed it and I agree with you. I still think that with an
opt-in style resource contention would be much easier to control. However, the
opt-in style in the form I originally proposed no longer seems like a good idea
to me: a classic opt-in style would deprive us of support for half of the
parallel a/v use cases. Anton's proposal seems to me like a good balance
between the two styles.
> > He also suggested the following : let the reloption overlap the value of the
> > GUC parameter. I.e. even if av_max_parallel_workers parameters is 0 the user
> > still can set the av_parallel_workers to 10 for some table, and autovacuum
> > will process this table in parallel.
> >
> > I remember that you want to use the GUC parameter as a global switch, and this
> > approach will break this logic. But according to Anton's words, it is okay if
> > the GUC parameter cannot disable parallel a/v for all tables instantly. It will
> > become an administrator's responsibility to manually turn off parallel a/v for
> > several tables (again, it is completely OK). Thus, this feature can be handy
> > for all use cases.
>
> While some autovacuum parameters do override GUCs, those are typically
> local to the process (like cost delay). Parallel workers, however, are
> a shared system-wide resource. In a multi-tenant environment, allowing
> a single table's reloption to bypass the
> autovacuum_max_parallel_workers = 0 limit could lead to unexpected
> exhaustion of the worker pool.
Will this exhaustion really be unexpected? If we describe such an ability in
the documentation, and the user uses it, then everything is fair. Even if the
administrator forgets that he enabled the av_parallel_workers reloption
somewhere, he can:
1) Check the log file (if the log level is not too high), searching for
   messages like "parallel workers: index vacuum: N planned, N launched in total".
2) Run a query that selects all tables which have av_parallel_workers > 0.
>I think that this GUC should act as a
> reliable global switch for resource management.
I agree that the "global switch" is an attractive idea and we should strive
for it. But our parameter *can* play the role of the switch if users don't
manually touch the av_parallel_workers reloption. But if they do - well, it is
their responsibility to turn the reloption off.
>
> > I hope it doesn't look like as an adapting to the needs of a specific user.
> > A lot of super-large productions are migrating to postgres now, and I believe
> > that we should ensure their comfort too.
>
> I'm not prioritizing one specific use case over another. I believe
> that there are also users who want to use parallel vacuum on hundreds
> of thousands of tables. We should consider a better solution while
> checking it from multiple perspectives such as the usability, the
> robustness and consistency with the existing features and behaviors
> etc.
For those users who want to use parallel a/v for hundreds of thousands of
tables we have the default value "-1" which allows parallel a/v everywhere via
GUC parameter manipulation.
For those users who want parallel a/v on several specific tables, we can
allow setting a reloption that will override the GUC.
I guess the question is: "Is it acceptable for the GUC parameter to lose the
ability to turn off parallel a/v everywhere once the user has manually raised
the av_parallel_workers reloption on a few tables?". If the answer is "Yes",
I don't see any obstacles to allowing the reloption to override the GUC
parameter.
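To make the difference between the two behaviors concrete, here is a small
sketch. It is illustrative only: effective_limit_sketch() and the
reloption_overrides flag are hypothetical and exist nowhere in the patch. With
the flag off it follows the semantics in the v35 commit message (a positive
reloption is still bounded by the GUC); with the flag on it follows the
override idea discussed above, assuming the table value is then used as is.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: per-table parallel worker limit for autovacuum.
 *
 * guc_max             - value of the global GUC (0 disables globally)
 * reloption           - per-table value: -1 (default), 0, or positive
 * reloption_overrides - true models the proposed "reloption wins" behavior
 */
static int
effective_limit_sketch(int guc_max, int reloption, bool reloption_overrides)
{
    assert(guc_max >= 0 && reloption >= -1);

    if (reloption == -1)
        return guc_max;         /* default: defer to the GUC */
    if (reloption == 0)
        return 0;               /* parallel a/v disabled for this table */
    if (reloption_overrides)
        return reloption;       /* proposed: table setting wins */

    /* Behavior described for v35: capped by the GUC. */
    return (reloption < guc_max) ? reloption : guc_max;
}
```

For a table with the reloption set to 10 and the GUC set to 0, the capped
variant yields 0 (the GUC acts as a global switch), while the override variant
yields 10, which is exactly the case in dispute.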
Thank you very much for your comments!
Please, see an updated patch.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v34-v35-diff.patch (1.5K, 2-v34-v35-diff.patch)
From 0333e9294a2bd167d83263f6fdbd70c2182fc565 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Wed, 1 Apr 2026 14:24:03 +0700
Subject: [PATCH] v34--v35-diff
---
doc/src/sgml/config.sgml | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 70d8771f0af..520c38e82a6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -9497,9 +9497,10 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Sets the maximum number of parallel workers that can be used by a
single autovacuum worker to process indexes. This limit applies
- specifically to the index vacuuming and index cleanup phases of
- autovacuum. The actual number of parallel workers is further limited
- by <xref linkend="guc-max-parallel-workers"/>. This is the
+ specifically to the index vacuuming and index cleanup phases (for the
+ details of each autovacuum phase, please refer to <xref linkend="vacuum-phases"/>).
+ The actual number of parallel workers is further limited by
+ <xref linkend="guc-max-parallel-workers"/>. This is the
per-autovacuum worker equivalent of the <literal>PARALLEL</literal>
option of the <link linkend="sql-vacuum"><command>VACUUM</command></link>
command. Setting this value to 0 disables parallel vacuum during autovacuum.
--
2.43.0
[text/x-patch] v35-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch (42.5K, 3-v35-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch)
From a115abdaab09d565ee49b9732f06ed6d9cb0cc27 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v35] Allow autovacuum to use parallel vacuum workers.
Previously, autovacuum always disabled parallel vacuum regardless of
the table's index count or configuration. This commit enables
autovacuum workers to use parallel index vacuuming and index cleanup,
using the same parallel vacuum infrastructure as manual VACUUM.
Two new configuration options control the feature. The GUC
autovacuum_max_parallel_workers sets the maximum number of parallel
workers a single autovacuum worker may launch; it defaults to 0,
preserving existing behavior unless explicitly enabled. The per-table
storage parameter autovacuum_parallel_workers provides per-table limits.
A value of 0 disables parallel vacuum for the table, a positive value
caps the worker count (still bounded by the GUC), and -1 (the default)
defers to the GUC.
To handle cases where autovacuum workers receive a SIGHUP and update
their cost-based vacuum delay parameters mid-operation, a new
propagation mechanism is added to vacuumparallel.c. The leader stores
its effective cost parameters in a DSM segment. Parallel vacuum
workers poll for changes in vacuum_delay_point(); if an update is
detected, they apply the new values locally via VacuumUpdateCosts().
A new test module, src/test/modules/test_autovacuum, is added to
verify that parallel autovacuum workers are correctly launched and
that cost-parameter updates are propagated as expected.
Author: Daniil Davydov <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Sami Imseih <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Bharath Rupireddy <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
Reviewed-by: zengman <[email protected]>
Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com
---
doc/src/sgml/config.sgml | 24 ++
doc/src/sgml/maintenance.sgml | 32 +++
doc/src/sgml/ref/create_table.sgml | 16 ++
doc/src/sgml/ref/vacuum.sgml | 23 +-
src/backend/access/common/reloptions.c | 11 +
src/backend/access/heap/vacuumlazy.c | 12 +
src/backend/commands/vacuum.c | 21 +-
src/backend/commands/vacuumparallel.c | 223 +++++++++++++++++-
src/backend/postmaster/autovacuum.c | 25 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 9 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/commands/vacuum.h | 2 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 197 ++++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
24 files changed, 616 insertions(+), 33 deletions(-)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 229f41353eb..520c38e82a6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9485,6 +9486,29 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel workers that can be used by a
+ single autovacuum worker to process indexes. This limit applies
+ specifically to the index vacuuming and index cleanup phases (for the
+ details of each autovacuum phase, please refer to <xref linkend="vacuum-phases"/>).
+ The actual number of parallel workers is further limited by
+ <xref linkend="guc-max-parallel-workers"/>. This is the
+ per-autovacuum worker equivalent of the <literal>PARALLEL</literal>
+ option of the <link linkend="sql-vacuum"><command>VACUUM</command></link>
+ command. Setting this value to 0 disables parallel vacuum during autovacuum.
+ The default is 0.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0d2a28207ed..eb6a07e086d 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1038,6 +1038,10 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
per-table <literal>autovacuum_vacuum_cost_delay</literal> or
<literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
are not considered in the balancing algorithm.
+ Parallel workers launched for <xref linkend="parallel-vacuum"/> are using
+ the same cost delay parameters as the leader worker. If any of these
+ parameters are changed in the leader worker, it will propagate the new
+ parameter values to all of its parallel workers.
</para>
<para>
@@ -1166,6 +1170,34 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</sect3>
</sect2>
+
+ <sect2 id="parallel-vacuum" xreflabel="Parallel Vacuum">
+ <title>Parallel Vacuum</title>
+
+ <para>
+ <command>VACUUM</command> can perform index vacuuming and index cleanup
+ phases in parallel using background workers (for the details of each
+ vacuum phase, please refer to <xref linkend="vacuum-phases"/>). The
+ degree of parallelism is determined by the number of indexes on the
+ relation that support parallel vacuum, limited by the <literal>PARALLEL</literal>
+ (for manual <command>VACUUM</command>) or the
+ <xref linkend="guc-autovacuum-max-parallel-workers"/> parameters (for
+ autovacuum).
+ </para>
+
+ <para>
+ An index can participate in parallel vacuum if and only if the size of the
+ index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+ Please note that it is not guaranteed that the number of parallel workers
+ specified in <replaceable class="parameter">integer</replaceable> will be
+ used during execution. It is possible for a vacuum to run with fewer
+ workers than specified, or even with no workers at all. Only one worker
+ can be used per index. So parallel workers are launched only when there
+ are at least <literal>2</literal> indexes in the table. Workers for
+ vacuum are launched before the start of each phase and exit at the end of
+ the phase. These behaviors might change in a future release.
+ </para>
+ </sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 80829b23945..e342585c7f0 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1738,6 +1738,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Per-table value for <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ parameter. If -1 is specified, <varname>autovacuum_max_parallel_workers</varname>
+ value will be used. If set to 0, parallel vacuum is disabled for
+ this table. The default value is -1.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index ac5d083d468..38ee973ea05 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -81,7 +81,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
is not obtained. However, extra space is not returned to the operating
system (in most cases); it's just kept available for re-use within the
same table. It also allows us to leverage multiple CPUs in order to process
- indexes. This feature is known as <firstterm>parallel vacuum</firstterm>.
+ indexes. This feature is known as <firstterm><xref linkend="parallel-vacuum"/></firstterm>.
To disable this feature, one can use <literal>PARALLEL</literal> option and
specify parallel workers as zero. <command>VACUUM FULL</command> rewrites
the entire contents of the table into a new disk file with no extra space,
@@ -266,24 +266,9 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<term><literal>PARALLEL</literal></term>
<listitem>
<para>
- Perform index vacuum and index cleanup phases of <command>VACUUM</command>
- in parallel using <replaceable class="parameter">integer</replaceable>
- background workers (for the details of each vacuum phase, please
- refer to <xref linkend="vacuum-phases"/>). The number of workers used
- to perform the operation is equal to the number of indexes on the
- relation that support parallel vacuum which is limited by the number of
- workers specified with <literal>PARALLEL</literal> option if any which is
- further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
- An index can participate in parallel vacuum if and only if the size of the
- index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
- Please note that it is not guaranteed that the number of parallel workers
- specified in <replaceable class="parameter">integer</replaceable> will be
- used during execution. It is possible for a vacuum to run with fewer
- workers than specified, or even with no workers at all. Only one worker
- can be used per index. So parallel workers are launched only when there
- are at least <literal>2</literal> indexes in the table. Workers for
- vacuum are launched before the start of each phase and exit at the end of
- the phase. These behaviors might change in a future release. This
+ Specifies the maximum number of parallel workers that can be used
+ for <xref linkend="parallel-vacuum"/>, which is further limited
+ by <xref linkend="guc-max-parallel-maintenance-workers"/>. This
option can't be used with the <literal>FULL</literal> option.
</para>
</listitem>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b41eafd7691..3e832c3797e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -236,6 +236,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1969,6 +1978,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 88c71cd85b6..39395aed0d5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -862,6 +863,17 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params->nworkers);
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0ed363d1c85..62d03c8e190 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..683a0f34e24 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -16,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -37,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -51,6 +61,33 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compares it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -120,6 +157,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -222,6 +271,17 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -233,6 +293,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -374,8 +435,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -392,6 +454,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -534,6 +614,103 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For non-autovacuum parallel workers, this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for the autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters had not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -555,12 +732,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +781,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -730,6 +912,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause after workers are launched but before index
+ * vacuuming begins.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1064,7 +1256,21 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
@@ -1119,6 +1325,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6694f485216..878a11a6b4b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1688,7 +1688,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
@@ -2928,8 +2928,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2939,6 +2937,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum, if the reloption sets the parallel
+ * degree as zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
+
/*
* Later, in vacuum_rel(), we check reloptions for any
* vacuum_max_eager_freeze_failure_rate override.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..15048aa9e56 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,14 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note,
+ * that parallel autovacuum leader is an exception because only cost-based
+ * delays need to be affected to parallel autovacuum workers. These
+ * parameters are propagated to its workers during parallel vacuum (see
+ * vacuumparallel.c for details).
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0a862693fcd..6ef46d88155 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -170,6 +170,14 @@
max => '10.0',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cf15597385b..73c49f09aef 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 53bf1e21721..1d941c11997 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5b8023616c0..69fec07491b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -422,6 +422,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 04f29748be7..3eaa4655c88 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..336a212faf4 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..929659956cb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..9e65eafb549
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,197 @@
+
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+# Test parallel autovacuum behavior
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
+$node->append_conf(
+ 'postgresql.conf', qq{
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is inited
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 8e9c06547d6..398c3f09f59 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2098,6 +2098,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.43.0
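
For readers following the cost-parameter propagation in the patch above, the
generation-counter handshake can be sketched in isolation. This is a minimal
standalone illustration, not the patch's code: SharedCostParams,
LocalCostParams, shared_init, leader_publish and worker_refresh are
hypothetical names, C11 atomics and a toy atomic_flag spinlock stand in for
pg_atomic_uint32 and slock_t, and only two of the five parameters are carried.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared struct; not the PostgreSQL type. */
typedef struct SharedCostParams
{
    atomic_uint generation;     /* bumped by the leader on every change */
    atomic_flag lock;           /* toy spinlock protecting the fields below */
    double      cost_delay;
    int         cost_limit;
} SharedCostParams;

/* Hypothetical per-worker state; seen_generation == 0 means "nothing loaded". */
typedef struct LocalCostParams
{
    uint32_t    seen_generation;
    double      cost_delay;
    int         cost_limit;
} LocalCostParams;

static void
toy_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;                       /* spin until the flag was previously clear */
}

static void
toy_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Leader: set the initial values and start the generation at 1. */
void
shared_init(SharedCostParams *shared, double cost_delay, int cost_limit)
{
    shared->cost_delay = cost_delay;
    shared->cost_limit = cost_limit;
    atomic_flag_clear(&shared->lock);
    atomic_init(&shared->generation, 1);
}

/* Leader: publish new values; returns true only if something changed. */
bool
leader_publish(SharedCostParams *shared, double cost_delay, int cost_limit)
{
    /* Only the leader writes these fields, so it may compare without the lock. */
    if (shared->cost_delay == cost_delay && shared->cost_limit == cost_limit)
        return false;

    toy_lock(&shared->lock);
    shared->cost_delay = cost_delay;
    shared->cost_limit = cost_limit;
    toy_unlock(&shared->lock);

    /* Bump the generation after the update so workers re-read the values. */
    atomic_fetch_add(&shared->generation, 1);
    return true;
}

/* Worker: reload local copies if the generation moved; returns true if it did. */
bool
worker_refresh(SharedCostParams *shared, LocalCostParams *local)
{
    uint32_t    generation = atomic_load(&shared->generation);

    assert(local->seen_generation <= generation);
    if (generation == local->seen_generation)
        return false;

    toy_lock(&shared->lock);
    local->cost_delay = shared->cost_delay;
    local->cost_limit = shared->cost_limit;
    toy_unlock(&shared->lock);

    local->seen_generation = generation;
    return true;
}
```

A worker whose seen_generation is still 0 always loads the initial values on
its first call, which mirrors the patch's choice to start the shared counter
at 1.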
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-01 12:10 Alexander Korotkov <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Alexander Korotkov @ 2026-04-01 12:10 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi, Daniil!
On Wed, Apr 1, 2026 at 10:44 AM Daniil Davydov <[email protected]> wrote:
> Thank you very much for your comments!
> Please, see an updated patch.
Thank you for your work on this subject! I've some notes about the patch.
1) The changes in guc.c allow the parallel autovacuum leader to accept
changes to any GUC, not just the cost-based ones. That should be no
problem, because parallel workers have their own copies of GUC
variables, but I think this is worth a comment.
2) The maximum value for the autovacuum_parallel_workers reloption is
hard-coded as the literal 1024, while the maximum for
autovacuum_max_parallel_workers is defined as MAX_PARALLEL_WORKER_LIMIT
(also 1024). Should we define the reloption's maximum as
MAX_PARALLEL_WORKER_LIMIT as well?
3) Some paragraphs were moved from vacuum.sgml to maintenance.sgml.
In particular, the moved text references <replaceable
class="parameter">integer</replaceable>, which relates to the PARALLEL
option syntax: (PARALLEL integer). Outside that context it becomes
unclear and needs to be revised.
4) I also think maintenance.sgml should mention the new reloption.
5) I think it is worth having a test that checks that setting
autovacuum_parallel_workers to 0 disables parallel autovacuum for the
given table.
6) Minor grammar issue in the PVSharedCostParams comment: it should
read "vacuum workers compare" (plural subject).
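
For point 5, the behavior such a test would pin down can be sketched as
standalone functions. This is a hedged illustration, not code from the patch:
reloption_to_nworkers and compute_parallel_workers are hypothetical names,
nindexes_parallel is assumed to be precomputed by the caller, and folding the
"disabled" case (nworkers < 0) into the second function is a simplification
made only for this sketch.

```c
#include <assert.h>

/*
 * Map the autovacuum_parallel_workers reloption to a requested worker count:
 *   -1 (unset) ->  0  (no explicit request, degree chosen from index count)
 *    0         -> -1  (parallel vacuum disabled for this table)
 *    n > 0     ->  n  (explicit request)
 */
int
reloption_to_nworkers(int reloption)
{
    if (reloption == 0)
        return -1;
    if (reloption > 0)
        return reloption;
    return 0;
}

/*
 * Simplified worker-count computation: honor the request if there is one,
 * never exceed the number of indexes that can be processed in parallel, and
 * cap the result by the GUC limit (max_workers).
 */
int
compute_parallel_workers(int nworkers, int nindexes_parallel, int max_workers)
{
    int         parallel_workers;

    assert(nindexes_parallel >= 0);

    /* Disabled per table, disabled by GUC, or nothing to parallelize. */
    if (nworkers < 0 || max_workers == 0 || nindexes_parallel == 0)
        return 0;

    parallel_workers = (nworkers > 0 && nworkers < nindexes_parallel) ?
        nworkers : nindexes_parallel;

    /* Cap by the GUC limit. */
    if (parallel_workers > max_workers)
        parallel_workers = max_workers;

    return parallel_workers;
}
```

With the reloption at 0 the sketch yields zero workers regardless of the GUC,
which is the property a dedicated test would assert.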
------
Regards,
Alexander Korotkov
Supabase
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-01 18:54 Masahiko Sawada <[email protected]>
parent: SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 2 replies; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-01 18:54 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Daniil Davydov <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Mon, Mar 30, 2026 at 5:14 PM SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> Hi
>
> On Mon, Mar 30, 2026 at 1:44 AM Daniil Davydov <[email protected]> wrote:
>>
>> Hi,
>>
>> On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM
>> <[email protected]> wrote:
>> >
>> > Thank you for working on this, very useful feature. Sharing a few thoughts:
>> >
>> > 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM resources in parallel_vacuum_compute_workers?
>>
>> Actually, autovacuum_max_parallel_workers is already limited by
>> max_parallel_workers. It is not clear to me why we allow setting this GUC
>> higher than max_parallel_workers, but if this happens, I think it is a user
>> misconfiguration.
>>
>> > 2. Is it intentional that other autovacuum workers do not yield cost limits to the parallel autovacuum workers? Cost limits are first distributed equally to the autovacuum workers,
>> > and then they share that. Therefore, parallel workers will be heavily throttled. IIUC, this problem doesn't exist with manual vacuum.
>> > If we don't fix this, we should at least document it.
>>
>> Parallel a/v workers inherit cost-based parameters (including the
>> vacuum_cost_limit) from the leader worker. Do you mean that this can be too
>> low a value for parallel operation? If so, the user can manually increase the
>> vacuum_cost_limit reloption for those tables where parallel a/v sleeps too
>> much (due to cost delay).
>>
>> BTW, describing the cost limit propagation to the parallel a/v workers is
>> worth mentioning in the documentation. I'll add it in the next patch version.
>>
>> > 3. Additionally, is there a point where, based on the cost limits, launching additional workers becomes counterproductive compared to running fewer workers and preventing it?
>>
>> I don't think that we can possibly find a universal limit that will be
>> appropriate for all possible configurations. For now we are using a pretty
>> simple formula for parallel degree calculation. Since users have several ways
>> to affect this formula, I guess that there will be no problems with it (except
>> my concerns about opt-out style).
>>
>> > 4. Would it make sense to add a table level override to disable parallelism or set parallel worker count?
>>
>> We already have the "autovacuum_parallel_workers" reloption that is used as
>> an additional limit for the number of parallel workers. In particular, this
>> reloption can be used to disable parallelism altogether.
>>
>> >
>> > I ran some perf tests to show the improvements with parallel vacuum and shared below.
>>
>> Thank you very much!
>>
>> > Observations:
>> >
>> > 1. Parallel autovacuum provides consistent speedup. With cost_limit=200 and
>> > 7 workers, vacuum completes 1.41x faster (71s -> 50s). With cost_limit=60,
>> > the speedup is 1.25x (194s -> 154s).
>> > 2. I see the benefit comes from parallelizing index vacuum. With 8 indexes totaling
>> > ~530 MB, parallel workers scan indexes concurrently instead of the leader
>> > scanning them one by one. The leader's CPU user time drops from ~3s to
>> > ~0.8s as index work is offloaded
>> >
>>
>> A 1.41x speedup with 7 parallel workers may not seem like a great win, but it
>> covers the whole autovacuum operation (not only index bulkdel/cleanup) with
>> pretty small indexes.
>>
>> May I ask you to run the same test with a larger table (several dozen
>> gigabytes)? I think the results will be more "expressive".
>
>
> I ran it with a billion rows in a table with 8 indexes. The improvement with 7 workers is 1.8x.
> Please note that there is a fixed overhead in other vacuum steps, for example the heap scan.
> In environments where cost-based delay is used (the default), benefits will be modest
> unless vacuum_cost_delay is set to a sufficiently large value.
>
> Hardware:
> CPU: Intel Xeon Platinum 8573C, 1 socket × 8 cores × 2 threads = 16 vCPUs
> RAM: 128 GB (131,900 MB)
> Swap: None
>
> Workload Description
>
> Table Schema:
> CREATE TABLE avtest (
> id bigint PRIMARY KEY,
> col1 int, -- random()*1e9
> col2 int, -- random()*1e9
> col3 int, -- random()*1e9
> col4 int, -- random()*1e9
> col5 int, -- random()*1e9
> col6 text, -- 'text_' || random()*1e6 (short text ~10 chars)
> col7 timestamp, -- now() - random()*365 days
> padding text -- repeat('x', 50)
> ) WITH (fillfactor = 90);
>
> Indexes (8 total):
> avtest_pkey — btree on (id) bigint
> idx_av_col1 — btree on (col1) int
> idx_av_col2 — btree on (col2) int
> idx_av_col3 — btree on (col3) int
> idx_av_col4 — btree on (col4) int
> idx_av_col5 — btree on (col5) int
> idx_av_col6 — btree on (col6) text
> idx_av_col7 — btree on (col7) timestamp
>
> Dead Tuple Generation:
> DELETE FROM avtest WHERE id % 5 IN (1, 2);
> This deletes exactly 40% of rows, uniformly distributed across all pages.
>
> Vacuum Trigger:
> Autovacuum is triggered naturally by lowering the threshold to 0 and setting
> scale_factor to a value that causes immediate launch after the DELETE.
>
> Worker Configurations Tested:
> 0 workers — leader-only vacuum (baseline, no parallelism)
> 2 workers — leader + 2 parallel workers (3 processes total)
> 4 workers — leader + 4 parallel workers (5 processes total)
> 7 workers — leader + 7 parallel workers (8 processes total, 1 per index)
>
> Dataset:
> Rows: 1,000,000,000
> Heap size: 139 GB
> Total size: 279 GB (heap + 8 indexes)
> Dead tuples: 400,000,000 (40%)
>
> Index Sizes:
> avtest_pkey 21 GB (bigint)
> idx_av_col7 21 GB (timestamp)
> idx_av_col1 18 GB (int)
> idx_av_col2 18 GB (int)
> idx_av_col3 18 GB (int)
> idx_av_col4 18 GB (int)
> idx_av_col5 18 GB (int)
> idx_av_col6 7 GB (text — shorter keys, smaller index)
> Total indexes: 139 GB
>
> Server Settings:
> shared_buffers = 96GB
> maintenance_work_mem = 1GB
> max_wal_size = 100GB
> checkpoint_timeout = 1h
> autovacuum_vacuum_cost_delay = 0ms (NO throttling)
> autovacuum_vacuum_cost_limit = 1000
>
>
> Summary:
>
> Workers   Avg(s)    Min(s)    Max(s)    Speedup   Time Saved
> -------   -------   -------   -------   -------   ------------------
> 0         1645.93   1645.01   1646.84   1.00x     -
> 2         1276.35   1275.64   1277.05   1.29x     369.58s (6.2 min)
> 4         1052.62   1048.92   1056.32   1.56x     593.31s (9.9 min)
> 7          892.23    886.59    897.86   1.84x     753.70s (12.6 min)
>
Thank you for sharing the performance test results!
While the benchmark results look good to me, have you compared the
performance differences between parallel vacuum in the VACUUM command
(with the PARALLEL option) and parallel vacuum in autovacuum? Since
parallel autovacuum introduces some logic to check for delay parameter
updates, I thought it was worth verifying if this adds any overhead.
BTW, in my view, the most challenging part of this patch is the
propagation logic for vacuum delay parameters. This propagation is
necessary because, unlike manual VACUUM, autovacuum workers can reload
their configuration during operation. We must ensure that parallel
workers stay synchronized with these updated parameters.
The current patch implements this in vacuumparallel.c: the leader
shares delay parameters in DSM and updates them (if any vacuum delay
parameters are updated) after a config reload, while workers poll for
updates at every vacuum_delay_point() call to refresh their local
variables.
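As an illustration of the polling scheme described above, here is a minimal standalone C sketch. It is not the patch's code: the type and function names (SharedCostParams, WorkerLocal, leader_publish, worker_poll) are hypothetical, C11 atomics stand in for pg_atomic_uint32 and slock_t, and the order "update under the lock, then bump the generation" is an assumption about the leader side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared struct that lives in the DSM segment. */
typedef struct SharedCostParams
{
    atomic_uint generation;     /* bumped by the leader after each update */
    atomic_flag lock;           /* tiny spinlock protecting the fields below */
    double      cost_delay;
    int         cost_limit;
} SharedCostParams;

/* Worker-local copies plus the last generation the worker has applied. */
typedef struct WorkerLocal
{
    unsigned    generation_seen;    /* 0 means "nothing loaded yet" */
    double      cost_delay;
    int         cost_limit;
} WorkerLocal;

static void
lock_acquire(atomic_flag *l)
{
    while (atomic_flag_test_and_set(l))
        ;                       /* spin until the flag was previously clear */
}

static void
lock_release(atomic_flag *l)
{
    atomic_flag_clear(l);
}

/* Leader: set the initial values; generation starts at 1, not 0. */
static void
shared_init(SharedCostParams *s, double delay, int limit)
{
    atomic_flag_clear(&s->lock);
    s->cost_delay = delay;
    s->cost_limit = limit;
    atomic_init(&s->generation, 1);
}

/* Leader: publish new values after a config reload. */
static void
leader_publish(SharedCostParams *s, double delay, int limit)
{
    lock_acquire(&s->lock);
    s->cost_delay = delay;
    s->cost_limit = limit;
    lock_release(&s->lock);
    atomic_fetch_add(&s->generation, 1);
}

/*
 * Worker: called at every delay point.  The common case is one atomic load
 * and a comparison; the lock is taken only when the generation has moved.
 * Returns true if the local copies were refreshed.
 */
static bool
worker_poll(SharedCostParams *s, WorkerLocal *w)
{
    unsigned    gen = atomic_load(&s->generation);

    if (gen == w->generation_seen)
        return false;

    lock_acquire(&s->lock);
    w->cost_delay = s->cost_delay;
    w->cost_limit = s->cost_limit;
    lock_release(&s->lock);

    w->generation_seen = gen;
    return true;
}
```

Because a worker starts with generation_seen = 0 and the shared counter starts at 1, the first poll always loads the initial values. If the leader publishes again between the worker's atomic load and its lock acquisition, the worker records the older generation and simply refreshes once more on its next poll.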
Another possible approach would be an event-driven model where the
leader notifies workers after updating shared parameters—for example,
by adding a shm_mq between the leader (as the sender) and each worker
(as the receiver).
I've compared these two ideas and opted for the former (polling).
While a polling approach could theoretically be costly, the current
implementation is self-contained within the parallel vacuum logic and
does not touch the core parallel query infrastructure. The
notification approach might look more elegant, but I'm concerned it
adds unnecessary complexity just for the autovacuum case. Since the
polling is essentially just checking an atomic variable, the overhead
should be negligible.
To verify this, I conducted benchmarks comparing the whole execution
time and index vacuuming duration.
Setup:
- Disabled (auto) vacuum delays and buffer usage limits.
- Parallel autovacuum with 1 worker on a table with 2 indexes (approx.
4 GB each).
- 5 runs.
Case 1: The latest patch (with polling)
Average: 3.95s (Index: 1.54s)
Median: 3.62s (Index: 1.37s)
Case 2: The latest patch without polling
Average: 3.98s (Index: 1.56s)
Median: 3.70s (Index: 1.40s)
Note that in order to simulate the code that doesn't have the polling,
I reverted the following change:
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
With this change reverted, the parallel vacuum workers don't check the shared
vacuum delay parameters at all, which is still fine as I disabled vacuum delays.
Overall, the results show no noticeable overhead from the polling approach.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-01 21:24 Daniil Davydov <[email protected]>
parent: Alexander Korotkov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-04-01 21:24 UTC (permalink / raw)
To: Alexander Korotkov <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Wed, Apr 1, 2026 at 7:10 PM Alexander Korotkov <[email protected]> wrote:
>
> Thank you for your work on this subject! I've some notes about the patch.
>
Thank you very much for the review!
> 1) The changes in guc.c allow the autovacuum parallel leader to accept
> changes in not just cost-based GUCs, but any GUCs. That should be no
> problem, because parallel workers have their own copies of GUC
> variables, but I think this is worth a comment.
OK, I will clarify it in the code.
> 2) Maximum value for autovacuum_parallel_workers reloption is defined
> as literally 1024, while max value for autovacuum_max_parallel_workers
> is defined as MAX_PARALLEL_WORKER_LIMIT (also 1024). Should we define
> max value for reloption as MAX_PARALLEL_WORKER_LIMIT as well?
I agree.
> 3) Some paragraphs were moved from vacuum.sgml to maintenance.sgml.
> In particular, it references <replaceable
> class="parameter">integer</replaceable>, which is related to the PARALLEL
> option syntax: (PARALLEL integer). Now it becomes unclear and needs
> to be revised.
Good catch! You are right.
> 4) I also think maintenance.sgml should mention the new reloption.
Do you mean that we should mention it in the "parallel-vacuum" chapter? If so,
I think that we should also mention that max_parallel_maintenance_workers can
affect the parallel degree of the manual VACUUM command. Yes, we have already
written about this in the description of the PARALLEL option. But currently the
"parallel-vacuum" chapter doesn't mention limiting by GUC for manual VACUUM and
limiting by reloption for autovacuum. IMHO it is better to have redundancy than
an incomplete description.
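The two limiting paths mentioned above (GUC for manual VACUUM, reloption plus GUC for autovacuum) can be sketched in a few lines of C. This is only an illustration of the documented semantics, not code from the patch: all function names are hypothetical, the has_option flag is a convention invented for this sketch (it is not how VacuumParams encodes the PARALLEL option), and the "one index is left to the leader" step is an assumption consistent with the 8-index, 7-worker benchmark earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

static int
min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Autovacuum: reloption autovacuum_parallel_workers
 *   -1 = defer to the GUC, 0 = disabled, >0 = cap (still bounded by the GUC).
 * guc_limit is autovacuum_max_parallel_workers (0 = disabled).
 */
static int
autovacuum_worker_cap(int reloption, int guc_limit)
{
    if (reloption == 0 || guc_limit <= 0)
        return 0;
    if (reloption < 0)
        return guc_limit;
    return min_int(reloption, guc_limit);
}

/*
 * Manual VACUUM: the PARALLEL option, if given, further limited by
 * max_parallel_maintenance_workers.  PARALLEL 0 disables parallelism.
 */
static int
manual_worker_cap(bool has_option, int option, int maintenance_limit)
{
    if (maintenance_limit <= 0)
        return 0;
    if (!has_option)
        return maintenance_limit;
    return min_int(option, maintenance_limit);
}

/*
 * Planned degree: at most one worker per eligible index, with one index
 * assumed to be handled by the leader itself.
 */
static int
planned_workers(int cap, int n_parallel_indexes)
{
    int         by_indexes = n_parallel_indexes - 1;

    if (cap <= 0 || by_indexes <= 0)
        return 0;
    return min_int(cap, by_indexes);
}
```

For example, planned_workers(autovacuum_worker_cap(-1, 16), 8) yields 7, and any table with a single eligible index yields 0 regardless of the configured caps. The number actually launched can still be lower, since it also depends on free slots under max_parallel_workers.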
> 5) I think it is worth having a test which checks that setting
> autovacuum_parallel_workers to 0 disables parallel autovacuum for the
> given table.
I see that VACUUM (PARALLEL) doesn't have such a test. Both manual VACUUM and
autovacuum have similar logic for disabling parallelism. Is checking this
logic really worth the increase in test completion time? I don't mind adding a
new test, actually. I just want to make sure that it is necessary.
> 6) Minor grammar issue in PVSharedCostParams comment, it must be
> "vacuum workers compare" (plural subject).
Yep, I'll fix it.
Please see the updated patch.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] v36-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch (42.9K, 2-v36-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch)
From a7d6bba9052885896fbe2ed7a60187dceca69547 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v36] Allow autovacuum to use parallel vacuum workers.
Previously, autovacuum always disabled parallel vacuum regardless of
the table's index count or configuration. This commit enables
autovacuum workers to use parallel index vacuuming and index cleanup,
using the same parallel vacuum infrastructure as manual VACUUM.
Two new configuration options control the feature. The GUC
autovacuum_max_parallel_workers sets the maximum number of parallel
workers a single autovacuum worker may launch; it defaults to 0,
preserving existing behavior unless explicitly enabled. The per-table
storage parameter autovacuum_parallel_workers provides per-table limits.
A value of 0 disables parallel vacuum for the table, a positive value
caps the worker count (still bounded by the GUC), and -1 (the default)
defers to the GUC.
To handle cases where autovacuum workers receive a SIGHUP and update
their cost-based vacuum delay parameters mid-operation, a new
propagation mechanism is added to vacuumparallel.c. The leader stores
its effective cost parameters in a DSM segment. Parallel vacuum
workers poll for changes in vacuum_delay_point(); if an update is
detected, they apply the new values locally via VacuumUpdateCosts().
A new test module, src/test/modules/test_autovacuum, is added to
verify that parallel autovacuum workers are correctly launched and
that cost-parameter updates are propagated as expected.
Author: Daniil Davydov <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Sami Imseih <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Bharath Rupireddy <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
Reviewed-by: zengman <[email protected]>
Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com
---
doc/src/sgml/config.sgml | 24 ++
doc/src/sgml/maintenance.sgml | 33 +++
doc/src/sgml/ref/create_table.sgml | 16 ++
doc/src/sgml/ref/vacuum.sgml | 23 +-
src/backend/access/common/reloptions.c | 12 +
src/backend/access/heap/vacuumlazy.c | 12 +
src/backend/commands/vacuum.c | 21 +-
src/backend/commands/vacuumparallel.c | 223 +++++++++++++++++-
src/backend/postmaster/autovacuum.c | 25 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 10 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/commands/vacuum.h | 2 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 197 ++++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
24 files changed, 619 insertions(+), 33 deletions(-)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 229f41353eb..520c38e82a6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9485,6 +9486,29 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel workers that can be used by a
+ single autovacuum worker to process indexes. This limit applies
+ specifically to the index vacuuming and index cleanup phases (for the
+ details of each autovacuum phase, please refer to <xref linkend="vacuum-phases"/>).
+ The actual number of parallel workers is further limited by
+ <xref linkend="guc-max-parallel-workers"/>. This is the
+ per-autovacuum worker equivalent of the <literal>PARALLEL</literal>
+ option of the <link linkend="sql-vacuum"><command>VACUUM</command></link>
+ command. Setting this value to 0 disables parallel vacuum during autovacuum.
+ The default is 0.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0d2a28207ed..884ac898065 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1038,6 +1038,10 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
per-table <literal>autovacuum_vacuum_cost_delay</literal> or
<literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
are not considered in the balancing algorithm.
+ Parallel workers launched for <xref linkend="parallel-vacuum"/> use
+ the same cost delay parameters as the leader worker. If any of these
+ parameters are changed in the leader worker, it propagates the new
+ parameter values to all of its parallel workers.
</para>
<para>
@@ -1166,6 +1170,35 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</sect3>
</sect2>
+
+ <sect2 id="parallel-vacuum" xreflabel="Parallel Vacuum">
+ <title>Parallel Vacuum</title>
+
+ <para>
+ <command>VACUUM</command> can perform index vacuuming and index cleanup
+ phases in parallel using background workers (for the details of each
+ vacuum phase, please refer to <xref linkend="vacuum-phases"/>). The
+ degree of parallelism is determined by the number of indexes on the
+ relation that support parallel vacuum. For manual <command>VACUUM</command>
+ it is limited by the <literal>PARALLEL</literal> option, if given, which is
+ further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
+ For autovacuum it is limited by the <xref linkend="reloption-autovacuum-parallel-workers"/>
+ reloption, if specified, which is further limited by the
+ <xref linkend="guc-autovacuum-max-parallel-workers"/> parameter. Please
+ note that it is not guaranteed that the number of parallel workers that was
+ calculated will be used during execution. It is possible for a vacuum to
+ run with fewer workers than specified, or even with no workers at all.
+ </para>
+
+ <para>
+ An index can participate in parallel vacuum if and only if the size of the
+ index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+ Only one worker can be used per index. So parallel workers are launched
+ only when there are at least <literal>2</literal> indexes in the table.
+ Workers for vacuum are launched before the start of each phase and exit at
+ the end of the phase. These behaviors might change in a future release.
+ </para>
+ </sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 80829b23945..e342585c7f0 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1738,6 +1738,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Per-table value for <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ parameter. If -1 is specified, <varname>autovacuum_max_parallel_workers</varname>
+ value will be used. If set to 0, parallel vacuum is disabled for
+ this table. The default value is -1.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index ac5d083d468..38ee973ea05 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -81,7 +81,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
is not obtained. However, extra space is not returned to the operating
system (in most cases); it's just kept available for re-use within the
same table. It also allows us to leverage multiple CPUs in order to process
- indexes. This feature is known as <firstterm>parallel vacuum</firstterm>.
+ indexes. This feature is known as <firstterm><xref linkend="parallel-vacuum"/></firstterm>.
To disable this feature, one can use <literal>PARALLEL</literal> option and
specify parallel workers as zero. <command>VACUUM FULL</command> rewrites
the entire contents of the table into a new disk file with no extra space,
@@ -266,24 +266,9 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<term><literal>PARALLEL</literal></term>
<listitem>
<para>
- Perform index vacuum and index cleanup phases of <command>VACUUM</command>
- in parallel using <replaceable class="parameter">integer</replaceable>
- background workers (for the details of each vacuum phase, please
- refer to <xref linkend="vacuum-phases"/>). The number of workers used
- to perform the operation is equal to the number of indexes on the
- relation that support parallel vacuum which is limited by the number of
- workers specified with <literal>PARALLEL</literal> option if any which is
- further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
- An index can participate in parallel vacuum if and only if the size of the
- index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
- Please note that it is not guaranteed that the number of parallel workers
- specified in <replaceable class="parameter">integer</replaceable> will be
- used during execution. It is possible for a vacuum to run with fewer
- workers than specified, or even with no workers at all. Only one worker
- can be used per index. So parallel workers are launched only when there
- are at least <literal>2</literal> indexes in the table. Workers for
- vacuum are launched before the start of each phase and exit at the end of
- the phase. These behaviors might change in a future release. This
+ Specifies the maximum number of parallel workers that can be used
+ for <xref linkend="parallel-vacuum"/>, which is further limited
+ by <xref linkend="guc-max-parallel-maintenance-workers"/>. This
option can't be used with the <literal>FULL</literal> option.
</para>
</listitem>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b41eafd7691..d675d52531c 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -28,6 +28,7 @@
#include "commands/defrem.h"
#include "commands/tablespace.h"
#include "nodes/makefuncs.h"
+#include "postmaster/bgworker_internals.h"
#include "storage/lock.h"
#include "utils/array.h"
#include "utils/attoptcache.h"
@@ -236,6 +237,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, MAX_PARALLEL_WORKER_LIMIT
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1969,6 +1979,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 88c71cd85b6..39395aed0d5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -862,6 +863,17 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params->nworkers);
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0ed363d1c85..62d03c8e190 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,19 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..bac3bd28214 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -16,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -37,6 +46,7 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -51,6 +61,33 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -120,6 +157,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -222,6 +271,17 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -233,6 +293,7 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -374,8 +435,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -392,6 +454,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
@@ -534,6 +614,103 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For non-autovacuum parallel workers, this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for the autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters had not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -555,12 +732,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +781,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -730,6 +912,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause after workers are launched but before index
+ * vacuuming begins.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
+#endif
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1064,7 +1256,21 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
@@ -1119,6 +1325,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6694f485216..878a11a6b4b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1688,7 +1688,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM or ANALYZE or parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
@@ -2928,8 +2928,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2939,6 +2937,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum, if the reloption sets the parallel
+ * degree as zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
+
/*
* Later, in vacuum_rel(), we check reloptions for any
* vacuum_max_eager_freeze_failure_rate override.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..c4c3fbc4fe3 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,15 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note,
+ * that parallel autovacuum leader is an exception because cost-based
+ * delays need to be affected to parallel autovacuum workers. These
+ * parameters are propagated to its workers during parallel vacuum (see
+ * vacuumparallel.c for details). All other changes will affect only the
+ * parallel autovacuum leader.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0a862693fcd..6ef46d88155 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -170,6 +170,14 @@
max => '10.0',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cf15597385b..73c49f09aef 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 53bf1e21721..1d941c11997 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5b8023616c0..69fec07491b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -422,6 +422,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 04f29748be7..3eaa4655c88 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..336a212faf4 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..929659956cb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..188ec9f96a2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..9e65eafb549
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,197 @@
+
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+# Test parallel autovacuum behavior
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
+$node->append_conf(
+ 'postgresql.conf', qq{
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+ SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is inited
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET vacuum_cost_limit = 500;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-leader-before-indexes-processing');
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
+});
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+ SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 8e9c06547d6..398c3f09f59 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2098,6 +2098,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.43.0
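The generation-counter handshake that the patch above uses to push cost-based delay parameters from the leader to its workers can be sketched in isolation. This is only a minimal single-threaded model, not the patch's code: the `Demo*` types and `demo_*` functions are hypothetical names, the parameter set is reduced to two fields, and the spinlock that the patch takes around the parameter copy is left out because nothing runs concurrently here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PVSharedCostParams, reduced to two fields. */
typedef struct DemoSharedParams
{
    _Atomic(uint32_t) generation;   /* starts at 1, bumped on every publish */
    double      cost_delay;
    int         cost_limit;
} DemoSharedParams;

/* Hypothetical per-worker copy; generation 0 means "nothing loaded yet". */
typedef struct DemoWorker
{
    uint32_t    seen_generation;
    double      cost_delay;
    int         cost_limit;
} DemoWorker;

static void
demo_shared_init(DemoSharedParams *shared, double delay, int limit)
{
    shared->cost_delay = delay;
    shared->cost_limit = limit;
    atomic_init(&shared->generation, 1);
}

/* Leader side: publish only if a value changed; returns true if published. */
static bool
demo_leader_publish(DemoSharedParams *shared, double delay, int limit)
{
    if (shared->cost_delay == delay && shared->cost_limit == limit)
        return false;

    /* The patch holds a spinlock around this copy; omitted in this model. */
    shared->cost_delay = delay;
    shared->cost_limit = limit;

    /* Bumping the generation tells workers to re-read the parameters. */
    atomic_fetch_add(&shared->generation, 1);
    return true;
}

/* Worker side: cheap check first, copy only when the generation moved. */
static bool
demo_worker_refresh(DemoWorker *worker, DemoSharedParams *shared)
{
    uint32_t    gen = atomic_load(&shared->generation);

    assert(worker->seen_generation <= gen);
    if (gen == worker->seen_generation)
        return false;

    worker->cost_delay = shared->cost_delay;
    worker->cost_limit = shared->cost_limit;
    worker->seen_generation = gen;
    return true;
}
```

Because the shared generation starts at 1 and a fresh worker starts at 0, the first call to `demo_worker_refresh()` always loads the initial values, which is the property the patch's comment on `shared_params_generation_local` relies on.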
[text/x-patch] v35-v36-diff.patch (5.3K, 3-v35-v36-diff.patch)
From 7f5a774d4262e10be077bed8cb6d5808ceb7a7d2 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 2 Apr 2026 04:20:35 +0700
Subject: [PATCH] v35--v36-diff
---
doc/src/sgml/maintenance.sgml | 25 +++++++++++++------------
src/backend/access/common/reloptions.c | 3 ++-
src/backend/commands/vacuumparallel.c | 2 +-
src/backend/utils/misc/guc.c | 5 +++--
4 files changed, 19 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index eb6a07e086d..884ac898065 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1179,23 +1179,24 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
phases in parallel using background workers (for the details of each
vacuum phase, please refer to <xref linkend="vacuum-phases"/>). The
degree of parallelism is determined by the number of indexes on the
- relation that support parallel vacuum, limited by the <literal>PARALLEL</literal>
- (for manual <command>VACUUM</command>) or the
- <xref linkend="guc-autovacuum-max-parallel-workers"/> parameters (for
- autovacuum).
+ relation that support parallel vacuum. For manual <command>VACUUM</command>
+ it is limited by the <literal>PARALLEL</literal> option if any which is
+ further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
+ For autovacuum it is limited by the <xref linkend="guc-autovacuum-max-workers"/>
+ reloption if specified which is further limited by
+ <xref linkend="guc-autovacuum-max-parallel-workers"/> parameter. Please
+ note that it is not guaranteed that the number of parallel workers that was
+ calculated will be used during execution. It is possible for a vacuum to
+ run with fewer workers than specified, or even with no workers at all.
</para>
<para>
An index can participate in parallel vacuum if and only if the size of the
index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
- Please note that it is not guaranteed that the number of parallel workers
- specified in <replaceable class="parameter">integer</replaceable> will be
- used during execution. It is possible for a vacuum to run with fewer
- workers than specified, or even with no workers at all. Only one worker
- can be used per index. So parallel workers are launched only when there
- are at least <literal>2</literal> indexes in the table. Workers for
- vacuum are launched before the start of each phase and exit at the end of
- the phase. These behaviors might change in a future release.
+ Only one worker can be used per index. So parallel workers are launched
+ only when there are at least <literal>2</literal> indexes in the table.
+ Workers for vacuum are launched before the start of each phase and exit at
+ the end of the phase. These behaviors might change in a future release.
</para>
</sect2>
</sect1>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 3e832c3797e..d675d52531c 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -28,6 +28,7 @@
#include "commands/defrem.h"
#include "commands/tablespace.h"
#include "nodes/makefuncs.h"
+#include "postmaster/bgworker_internals.h"
#include "storage/lock.h"
#include "utils/array.h"
#include "utils/attoptcache.h"
@@ -243,7 +244,7 @@ static relopt_int intRelOpts[] =
RELOPT_KIND_HEAP,
ShareUpdateExclusiveLock
},
- -1, -1, 1024
+ -1, -1, MAX_PARALLEL_WORKER_LIMIT
},
{
{
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 683a0f34e24..bac3bd28214 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -70,7 +70,7 @@ typedef struct PVSharedCostParams
/*
* The generation counter is incremented by the leader process each time
* it updates the shared cost-based vacuum delay parameters. Parallel
- * vacuum workers compares it with their local generation,
+ * vacuum workers compare it with their local generation,
* shared_params_generation_local, to detect whether they need to refresh
* their local parameters. The generation starts from 1 so that a freshly
* started worker (whose local copy is 0) will always load the initial
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 15048aa9e56..c4c3fbc4fe3 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3359,10 +3359,11 @@ set_config_with_handle(const char *name, config_handle *handle,
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
* Other changes might need to affect other workers, so forbid them. Note,
- * that parallel autovacuum leader is an exception because only cost-based
+ * that parallel autovacuum leader is an exception because cost-based
* delays need to be affected to parallel autovacuum workers. These
* parameters are propagated to its workers during parallel vacuum (see
- * vacuumparallel.c for details).
+ * vacuumparallel.c for details). All other changes will affect only the
+ * parallel autovacuum leader.
*/
if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
action != GUC_ACTION_SAVE &&
--
2.43.0
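The limits described in the documentation hunk above (reloption first, then the GUC) can be illustrated with a small model of how the parallel degree might be derived. This is a sketch under stated assumptions rather than the patch's code: `demo_reloption_to_nworkers()` and `demo_compute_workers()` are hypothetical names, and the step where the leader keeps one index for itself is not visible in the excerpted hunks; it is assumed here because it is consistent with the documented rule that workers are launched only when there are at least 2 indexes.

```c
#include <assert.h>

#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Map the autovacuum_parallel_workers reloption to the requested degree:
 *   0  -> -1 (parallel vacuum disabled for the table)
 *   >0 -> that many workers requested
 *   -1 ->  0 (unset: derive the degree from the number of indexes)
 */
static int
demo_reloption_to_nworkers(int reloption)
{
    if (reloption == 0)
        return -1;
    if (reloption > 0)
        return reloption;
    return 0;
}

/*
 * Compute the number of workers to request. nindexes_eligible counts the
 * indexes that can take part in parallel vacuum; max_workers is the
 * applicable GUC limit (autovacuum_max_parallel_workers for autovacuum).
 */
static int
demo_compute_workers(int nrequested, int nindexes_eligible, int max_workers)
{
    int         workers;

    assert(nindexes_eligible >= 0 && max_workers >= 0);

    /* Disabled by the reloption or by the GUC */
    if (nrequested < 0 || max_workers == 0)
        return 0;

    /* Assumption: the leader processes one index itself */
    workers = nindexes_eligible - 1;
    if (workers <= 0)
        return 0;

    /* An explicit request can only lower the degree */
    if (nrequested > 0)
        workers = DEMO_MIN(nrequested, workers);

    /* Cap by the GUC limit */
    return DEMO_MIN(workers, max_workers);
}
```

With four eligible indexes, a reloption of 2 and a GUC limit of 2, this model requests 2 workers, which is consistent with the "2 planned" that the TAP test in the v36 patch looks for.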
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-01 21:43 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-04-01 21:43 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Thu, Apr 2, 2026 at 1:55 AM Masahiko Sawada <[email protected]> wrote:
>
> Overall, the results show no noticeable overhead from the polling approach.
Great news! Thank you for these measurements!
BTW, I caught myself thinking that Tom Lane and maybe some other people might
not like our parameter propagation logic. We are not building any new
capability, but supplying a utilitarian solution for a single feature.
Perhaps someone will not consider this a good way to develop new features.
However, I don't think that this is a bad thing. We have pretty simple
logic that does not interfere with any other infrastructure. On the other
hand, maybe I am thinking in terms of big-tech product development, where
results (rather than the design) are often the most important thing.
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-01 23:15 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-01 23:15 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Apr 1, 2026 at 2:24 PM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Wed, Apr 1, 2026 at 7:10 PM Alexander Korotkov <[email protected]> wrote:
> >
> > Thank you for your work on this subject! I've some notes about the patch.
> >
>
> Thank you very much for the review!
>
> > 1) The changes in guc.c allow the parallel autovacuum leader to accept
> > changes in not just cost-based GUCs, but any GUCs. That should be no
> > problem, because parallel workers have their own copies of GUC
> > variables, but I think this is worth a comment.
>
> OK, I will clarify it in the code.
>
> > 2) Maximum value for autovacuum_parallel_workers reloption is defined
> > as literally 1024, while max value for autovacuum_max_parallel_workers
> > is defined as MAX_PARALLEL_WORKER_LIMIT (also 1024). Should we define
> > max value for reloption as MAX_PARALLEL_WORKER_LIMIT as well?
>
> I agree.
>
> > 3) Some paragraphs were moved from vacuum.sgml to maintenance.sgml.
> > In particular, the text references <replaceable
> > class="parameter">integer</replaceable>, which is related to the PARALLEL
> > option syntax: (PARALLEL integer). Now it has become unclear and needs
> > to be revised.
>
> Good catch! You are right.
>
> > 4) I also think maintenance.sgml should mention the new reloption.
>
> Do you mean that we should mention it in the "parallel-vacuum" chapter? If so,
> I think that we should also mention that max_parallel_maintenance_workers can
> affect the parallel degree of manual VACUUM command. Yes, we have already
> written about this in the description of the PARALLEL option. But now the
> "vacuum-parallel" chapter doesn't mention limiting by GUC for manual VACUUM and
> limiting by reloption for autovacuum. IMHO it is better to have redundancy than
> an incomplete description.
>
> > 5) I think it is worth having a test which checks that setting
> > autovacuum_parallel_workers to 0 disables the parallel autovacuum for
> > the given table.
>
> I see that VACUUM (PARALLEL) doesn't have such a test. Both manual VACUUM and
> autovacuum have similar logic for disabling parallelism. Is the increase in
> test completion time really worth checking this logic? I don't mind adding a
> new test, actually. Just want to make sure that this is necessary.
>
> > 6) Minor grammar issue in PVSharedCostParams comment, it must be
> > "vacuum workers compare" (plural subject).
>
> Yep, I'll fix it.
>
>
> Please, see an updated patch.
Thank you for updating the patch! I found a bug in the following code:
@@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
If an autovacuum worker raises an error during parallel vacuum, it
doesn't reset pv_shared_cost_params to NULL. Then, if it doesn't use
parallel vacuum on the next table it vacuums, it would end up with a SEGV
as it attempts to propagate the vacuum delay parameters through the stale
pointer.
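The failure mode described above can be modelled without PostgreSQL at all. The following is a minimal sketch, assuming a process-local pointer that is cleared only on the success path; every `demo_*` name is hypothetical, a static struct plus a "mapped" flag stands in for the DSM segment, and the `reset_on_error` switch shows one possible shape of a fix (the thread excerpt does not say which fix was adopted).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoCostParams
{
    int         cost_limit;
} DemoCostParams;

typedef enum DemoPropagateResult
{
    DEMO_SKIP,      /* pointer is NULL: nothing to propagate */
    DEMO_OK,        /* parameters written to a live segment */
    DEMO_STALE      /* pointer refers to a segment that is gone */
} DemoPropagateResult;

static DemoCostParams demo_segment;             /* stands in for the DSM segment */
static bool demo_segment_mapped = false;        /* is the segment still attached? */
static DemoCostParams *demo_shared_params = NULL;   /* process-local pointer */

/* Model of one parallel vacuum; "fail" simulates an error raised midway. */
static bool
demo_parallel_vacuum(bool fail, bool reset_on_error)
{
    demo_segment_mapped = true;
    demo_segment.cost_limit = 200;
    demo_shared_params = &demo_segment;

    if (fail)
    {
        /* Abort cleanup detaches the segment... */
        demo_segment_mapped = false;
        /* ...but the pointer is cleared only if the error path does so. */
        if (reset_on_error)
            demo_shared_params = NULL;
        return false;
    }

    /* Success path: what parallel_vacuum_end() does in the patch. */
    demo_shared_params = NULL;
    demo_segment_mapped = false;
    return true;
}

/* Model of the leader's propagate step on a later, non-parallel table. */
static DemoPropagateResult
demo_propagate(int new_limit)
{
    if (demo_shared_params == NULL)
        return DEMO_SKIP;
    if (!demo_segment_mapped)
        return DEMO_STALE;  /* a real process would touch unmapped memory here */
    demo_shared_params->cost_limit = new_limit;
    return DEMO_OK;
}
```

In this model the NULL check alone is not enough: after a failed run without the reset, `demo_propagate()` gets past the check and reaches the point where a real backend would dereference memory that is no longer mapped.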
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-02 09:22 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-02 09:22 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Apr 1, 2026 at 2:43 PM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Thu, Apr 2, 2026 at 1:55 AM Masahiko Sawada <[email protected]> wrote:
> >
> > Overall, the results show no noticeable overhead from the polling approach.
>
> Great news! Thank you for these measurements!
>
> BTW, I caught myself thinking that Tom Lane and maybe some other people might
> not like our parameter propagation logic. We are not building any new
> capability, but supplying an utilitarian solution for a single feature.
> Perhaps someone will not consider this a good way to develop new features.
Agreed, this is one of the reasons why I summarized and conducted
the performance evaluation.
I've also experimented with the idea of using proc signals to propagate
updates to the vacuum delay parameters. The attached patch can be applied
on top of the v36 patch. While it requires adding a new function to
parallel.c and a bit more code, it uses the existing infrastructure
for the propagation. Given that this propagation logic works only when
parallel vacuum workers are running, testing it in TAP tests using
injection points becomes complicated, so I removed that test. What do
you think about this idea?
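The general shape of the signal-driven idea can be sketched in standalone C, outside PostgreSQL. This is only an illustration under assumptions: it uses a plain POSIX SIGUSR1 handler and raise() within a single process instead of the procsignal machinery, and every name in it (params_update_pending, shared_cost_limit, local_cost_limit, worker_setup, leader_update, worker_delay_point) is hypothetical. The point it shows is that the handler only sets a flag, and the actual refresh happens later at a safe point.

```c
#include <assert.h>
#include <signal.h>

/* Set by the signal handler, consumed at the next safe point. */
static volatile sig_atomic_t params_update_pending = 0;

/* Hypothetical stand-ins for the shared value and the worker-local copy. */
static int shared_cost_limit = 200;
static int local_cost_limit = 200;

/* The handler only records the request; it does no real work. */
static void
handle_params_update(int signo)
{
    (void) signo;
    params_update_pending = 1;
}

/* Worker side: install the handler. Returns 0 on success. */
static int
worker_setup(void)
{
    return signal(SIGUSR1, handle_params_update) == SIG_ERR ? -1 : 0;
}

/* Leader side: publish the new value first, then notify. */
static int
leader_update(int new_limit)
{
    shared_cost_limit = new_limit;
    return raise(SIGUSR1);
}

/*
 * Worker side: the safe point, analogous in role to vacuum_delay_point().
 * Returns 1 if the local copy was refreshed, 0 otherwise.
 */
static int
worker_delay_point(void)
{
    if (!params_update_pending)
        return 0;

    params_update_pending = 0;
    local_cost_limit = shared_cost_limit;
    return 1;
}
```

After leader_update(500), the local copy stays at 200 until the next worker_delay_point() call, which is the same deferred behavior the pending flag gives in the attached patch.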
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-patch] 0001-POC-use-proc-signals-to-propagate-the-shared-vacuum-.patch (21.9K, 2-0001-POC-use-proc-signals-to-propagate-the-shared-vacuum-.patch)
download | inline diff:
From f90a43e0a6297511144f2b31b322e508bcdc3369 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Thu, 2 Apr 2026 02:07:23 -0700
Subject: [PATCH] POC: use proc signals to propagate the shared vacuum delay
parameters.
---
src/backend/access/heap/vacuumlazy.c | 12 --
src/backend/access/transam/parallel.c | 20 +++
src/backend/commands/vacuum.c | 38 +++--
src/backend/commands/vacuumparallel.c | 146 ++++++++----------
src/backend/storage/ipc/procsignal.c | 4 +
src/include/access/parallel.h | 2 +
src/include/commands/vacuum.h | 5 +
src/include/storage/procsignal.h | 4 +-
src/test/modules/test_autovacuum/Makefile | 4 -
src/test/modules/test_autovacuum/meson.build | 3 -
.../t/001_parallel_autovacuum.pl | 78 ----------
11 files changed, 122 insertions(+), 194 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 39395aed0d5..88c71cd85b6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,7 +152,6 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
-#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -863,17 +862,6 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params->nworkers);
-#ifdef USE_INJECTION_POINTS
-
- /*
- * Used by tests to pause before parallel vacuum is launched, allowing
- * test code to modify configuration that the leader then propagates to
- * workers.
- */
- if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
- INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
-#endif
-
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ab1dfb30e73..a8ba09832fc 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1035,6 +1035,26 @@ ParallelContextActive(void)
return !dlist_is_empty(&pcxt_list);
}
+/*
+ * Send a signal to all workers.
+ */
+void
+SendProcSignalToParallelWorkers(ParallelContext *pcxt, ProcSignalReason reason)
+{
+ for (int i = 0; i < pcxt->nworkers_launched; i++)
+ {
+ pid_t pid;
+
+ if (pcxt->worker[i].error_mqh == NULL ||
+ pcxt->worker[i].bgwhandle == NULL ||
+ GetBackgroundWorkerPid(pcxt->worker[i].bgwhandle,
+ &pid) != BGWH_STARTED)
+ continue;
+
+ (void) SendProcSignal(pid, reason, ParallelLeaderProcNumber);
+ }
+}
+
/*
* Handle receipt of an interrupt indicating a parallel worker message.
*
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 62d03c8e190..2bf53debc4e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -621,6 +621,9 @@ vacuum(List *relations, const VacuumParams *params, BufferAccessStrategy bstrate
VacuumSharedCostBalance = NULL;
VacuumActiveNWorkers = NULL;
+ ParallelVacuumSharedCostParams = NULL;
+ CurrentParallelVacuumState = NULL;
+
/*
* Loop to process each selected relation.
*/
@@ -2435,19 +2438,8 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending)
- return;
-
- if (IsParallelWorker())
- {
- /*
- * Update cost-based vacuum delay parameters for a parallel autovacuum
- * worker if any changes are detected.
- */
- parallel_vacuum_update_shared_delay_params();
- }
-
- if (!VacuumCostActive && !ConfigReloadPending)
+ if (InterruptPending ||
+ (!VacuumCostActive && !ConfigReloadPending && !ParallelVacuumParamsUpdatePending))
return;
/*
@@ -2469,6 +2461,26 @@ vacuum_delay_point(bool is_analyze)
parallel_vacuum_propagate_shared_delay_params();
}
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if requested from the autovacuum worker (the leader process).
+ */
+ if (ParallelVacuumParamsUpdatePending && ParallelVacuumSharedCostParams != NULL)
+ {
+ Assert(IsParallelWorker());
+
+ ParallelVacuumParamsUpdatePending = false;
+ parallel_vacuum_update_shared_delay_params();
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+ }
+
/*
* If we disabled cost-based delays after reloading the config file,
* return.
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bac3bd28214..c3d7ad2366e 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -21,9 +21,8 @@
* For parallel autovacuum, we need to propagate cost-based vacuum delay
* parameters from the leader to its workers, as the leader's parameters can
* change even while processing a table (e.g., due to a config reload).
- * The PVSharedCostParams struct manages these parameters using a
- * generation counter. Each parallel worker polls this shared state and
- * refreshes its local delay parameters whenever a change is detected.
+ * After the autovacuum process updates any vacuum delay parameters, it
+ * notifies its parallel vacuum workers to refresh their local parameters.
*
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -46,7 +45,6 @@
#include "storage/bufmgr.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
-#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -67,17 +65,6 @@
*/
typedef struct PVSharedCostParams
{
- /*
- * The generation counter is incremented by the leader process each time
- * it updates the shared cost-based vacuum delay parameters. Parallel
- * vacuum workers compare it with their local generation,
- * shared_params_generation_local, to detect whether they need to refresh
- * their local parameters. The generation starts from 1 so that a freshly
- * started worker (whose local copy is 0) will always load the initial
- * parameters on its first check.
- */
- pg_atomic_uint32 generation;
-
slock_t mutex; /* protects all fields below */
/* Parameters to share with parallel workers */
@@ -271,16 +258,21 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
-static PVSharedCostParams *pv_shared_cost_params = NULL;
+/*
+ * ParallelVacuumSharedCostParams and CurrentParallelVacuumState are
+ * used only for an autovacuum worker to propagate the shared vacuum
+ * delay parameters to its parallel vacuum workers.
+ *
+ * CurrentParallelVacuumState is set only in the leader process (i.e.,
+ * autovacuum worker processes).
+ */
+PVSharedCostParams *ParallelVacuumSharedCostParams = NULL;
+ParallelVacuumState *CurrentParallelVacuumState = NULL;
/*
- * Worker-local copy of the last cost-parameter generation this worker has
- * applied. Initialized to 0; since the leader initializes the shared
- * generation counter to 1, the first call to
- * parallel_vacuum_update_shared_delay_params() will always detect a
- * mismatch and read the initial parameters from shared memory.
+ * Is there an update on the shared vacuum delay parameters?
*/
-static uint32 shared_params_generation_local = 0;
+volatile sig_atomic_t ParallelVacuumParamsUpdatePending = false;
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
@@ -463,10 +455,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
if (shared->is_autovacuum)
{
parallel_vacuum_set_cost_parameters(&shared->cost_params);
- pg_atomic_init_u32(&shared->cost_params.generation, 1);
SpinLockInit(&shared->cost_params.mutex);
- pv_shared_cost_params = &(shared->cost_params);
+ ParallelVacuumSharedCostParams = &(shared->cost_params);
+ CurrentParallelVacuumState = pvs;
}
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
@@ -535,7 +527,10 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
ExitParallelMode();
if (AmAutoVacuumWorkerProcess())
- pv_shared_cost_params = NULL;
+ {
+ ParallelVacuumSharedCostParams = NULL;
+ CurrentParallelVacuumState = NULL;
+ }
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
@@ -627,6 +622,23 @@ parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
params->cost_page_miss = VacuumCostPageMiss;
}
+/*
+ * Handle receipt of an interrupt indicating an update of the shared vacuum
+ * delay parameters.
+ */
+void
+HandleParallelVacuumParamsUpdate(void)
+{
+ if (ParallelVacuumSharedCostParams == NULL)
+ return;
+
+ /*
+ * Update the shared delay parameters at the next vacuum_delay_point()
+ * time.
+ */
+ ParallelVacuumParamsUpdatePending = true;
+}
+
/*
* Updates the cost-based vacuum delay parameters for parallel autovacuum
* workers.
@@ -636,40 +648,18 @@ parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
void
parallel_vacuum_update_shared_delay_params(void)
{
- uint32 params_generation;
-
Assert(IsParallelWorker());
+ Assert(ParallelVacuumSharedCostParams);
- /* Quick return if the worker is not running for the autovacuum */
- if (pv_shared_cost_params == NULL)
- return;
-
- params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
- Assert(shared_params_generation_local <= params_generation);
-
- /* Return if parameters had not changed in the leader */
- if (params_generation == shared_params_generation_local)
- return;
-
- SpinLockAcquire(&pv_shared_cost_params->mutex);
- VacuumCostDelay = pv_shared_cost_params->cost_delay;
- VacuumCostLimit = pv_shared_cost_params->cost_limit;
- VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
- VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
- VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
- SpinLockRelease(&pv_shared_cost_params->mutex);
+ SpinLockAcquire(&ParallelVacuumSharedCostParams->mutex);
+ VacuumCostDelay = ParallelVacuumSharedCostParams->cost_delay;
+ VacuumCostLimit = ParallelVacuumSharedCostParams->cost_limit;
+ VacuumCostPageDirty = ParallelVacuumSharedCostParams->cost_page_dirty;
+ VacuumCostPageHit = ParallelVacuumSharedCostParams->cost_page_hit;
+ VacuumCostPageMiss = ParallelVacuumSharedCostParams->cost_page_miss;
+ SpinLockRelease(&ParallelVacuumSharedCostParams->mutex);
VacuumUpdateCosts();
-
- shared_params_generation_local = params_generation;
-
- elog(DEBUG2,
- "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
- vacuum_cost_limit,
- vacuum_cost_delay,
- VacuumCostPageMiss,
- VacuumCostPageDirty,
- VacuumCostPageHit);
}
/*
@@ -685,30 +675,30 @@ parallel_vacuum_propagate_shared_delay_params(void)
/*
* Quick return if the leader process is not sharing the delay parameters.
*/
- if (pv_shared_cost_params == NULL)
+ if (ParallelVacuumSharedCostParams == NULL)
return;
/*
* Check if any delay parameters have changed. We can read them without
* locks as only the leader can modify them.
*/
- if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
- vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
- VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
- VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
- VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ if (vacuum_cost_delay == ParallelVacuumSharedCostParams->cost_delay &&
+ vacuum_cost_limit == ParallelVacuumSharedCostParams->cost_limit &&
+ VacuumCostPageDirty == ParallelVacuumSharedCostParams->cost_page_dirty &&
+ VacuumCostPageHit == ParallelVacuumSharedCostParams->cost_page_hit &&
+ VacuumCostPageMiss == ParallelVacuumSharedCostParams->cost_page_miss)
return;
/* Update the shared delay parameters */
- SpinLockAcquire(&pv_shared_cost_params->mutex);
- parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
- SpinLockRelease(&pv_shared_cost_params->mutex);
+ SpinLockAcquire(&ParallelVacuumSharedCostParams->mutex);
+ parallel_vacuum_set_cost_parameters(ParallelVacuumSharedCostParams);
+ SpinLockRelease(&ParallelVacuumSharedCostParams->mutex);
- /*
- * Increment the generation of the parameters, i.e. let parallel workers
- * know that they should re-read shared cost params.
- */
- pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+ Assert(CurrentParallelVacuumState);
+ SendProcSignalToParallelWorkers(CurrentParallelVacuumState->pcxt,
+ PROCSIG_PARALLEL_VACUUM_PARAMS_UPDATE);
+
+ elog(LOG, "XXX SEND PROCSIGS");
}
/*
@@ -912,16 +902,6 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
pvs->pcxt->nworkers_launched, nworkers)));
}
-#ifdef USE_INJECTION_POINTS
-
- /*
- * Used by tests to pause after workers are launched but before index
- * vacuuming begins.
- */
- if (AmAutoVacuumWorkerProcess() && nworkers > 0)
- INJECTION_POINT("autovacuum-leader-before-indexes-processing", NULL);
-#endif
-
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
@@ -1262,10 +1242,10 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
* Parallel autovacuum workers initialize cost-based delay parameters
* from the leader's shared state rather than GUC defaults, because
* the leader may have applied per-table or autovacuum-specific
- * overrides. pv_shared_cost_params must be set before calling
- * parallel_vacuum_update_shared_delay_params().
+ * overrides. ParallelVacuumSharedCostParams must be set before
+ * calling parallel_vacuum_update_shared_delay_params().
*/
- pv_shared_cost_params = &(shared->cost_params);
+ ParallelVacuumSharedCostParams = &(shared->cost_params);
parallel_vacuum_update_shared_delay_params();
}
else
@@ -1327,7 +1307,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
FreeAccessStrategy(pvs.bstrategy);
if (shared->is_autovacuum)
- pv_shared_cost_params = NULL;
+ ParallelVacuumSharedCostParams = NULL;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..bf4f17ca8b4 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -19,6 +19,7 @@
#include "access/parallel.h"
#include "commands/async.h"
+#include "commands/vacuum.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "port/pg_bitutils.h"
@@ -703,6 +704,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT))
HandleRecoveryConflictInterrupt();
+ if (CheckProcSignal(PROCSIG_PARALLEL_VACUUM_PARAMS_UPDATE))
+ HandleParallelVacuumParamsUpdate();
+
SetLatch(MyLatch);
}
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 60f857675e0..90084581f7c 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -19,6 +19,7 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
#include "postmaster/bgworker.h"
+#include "storage/procsignal.h"
#include "storage/shm_mq.h"
#include "storage/shm_toc.h"
@@ -71,6 +72,7 @@ extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
extern void DestroyParallelContext(ParallelContext *pcxt);
extern bool ParallelContextActive(void);
+extern void SendProcSignalToParallelWorkers(ParallelContext *pcxt, ProcSignalReason reason);
extern void HandleParallelMessageInterrupt(void);
extern void ProcessParallelMessages(void);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 69fec07491b..8b59e3ae1c2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -360,6 +360,10 @@ extern PGDLLIMPORT int vacuum_cost_limit;
extern PGDLLIMPORT int64 parallel_vacuum_worker_delay_ns;
+extern PGDLLIMPORT struct PVSharedCostParams *ParallelVacuumSharedCostParams;
+extern PGDLLIMPORT struct ParallelVacuumState *CurrentParallelVacuumState;
+extern PGDLLIMPORT volatile sig_atomic_t ParallelVacuumParamsUpdatePending;
+
/* in commands/vacuum.c */
extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
extern void vacuum(List *relations, const VacuumParams *params,
@@ -425,6 +429,7 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
extern void parallel_vacuum_update_shared_delay_params(void);
extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
+extern void HandleParallelVacuumParamsUpdate(void);
/* in commands/analyze.c */
extern void analyze_rel(Oid relid, RangeVar *relation,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..ac9a6e841fd 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -39,9 +39,11 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT, /* backend is blocking recovery, check
* PGPROC->pendingRecoveryConflicts for the
* reason */
+ PROCSIG_PARALLEL_VACUUM_PARAMS_UPDATE, /* parallel autovacuum worker:
+ * reload cost params */
} ProcSignalReason;
-#define NUM_PROCSIGNALS (PROCSIG_RECOVERY_CONFLICT + 1)
+#define NUM_PROCSIGNALS (PROCSIG_PARALLEL_VACUUM_PARAMS_UPDATE + 1)
typedef enum
{
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
index 188ec9f96a2..d91a3405e34 100644
--- a/src/test/modules/test_autovacuum/Makefile
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -4,10 +4,6 @@ PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
TAP_TESTS = 1
-EXTRA_INSTALL = src/test/modules/injection_points
-
-export enable_injection_points
-
ifdef USE_PGXS
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
index 86e392bc0de..94a3e8a038d 100644
--- a/src/test/modules/test_autovacuum/meson.build
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -5,9 +5,6 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {
- 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
- },
'tests': [
't/001_parallel_autovacuum.pl',
],
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
index 9e65eafb549..6536a3b265a 100644
--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -9,11 +9,6 @@ use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
-if ($ENV{enable_injection_points} ne 'yes')
-{
- plan skip_all => 'Injection points not supported by this build';
-}
-
# Before each test we should disable autovacuum for 'test_autovac' table and
# generate some dead tuples in it. Returns the current autovacuum_count of
# the table test_autovac.
@@ -66,20 +61,6 @@ log_autovacuum_min_duration = -1
});
$node->start;
-# Check if the extension injection_points is available, as it may be
-# possible that this script is run with installcheck, where the module
-# would not be installed by default.
-if (!$node->check_extension('injection_points'))
-{
- plan skip_all => 'Extension injection_points not installed';
-}
-
-# Create all functions needed for testing
-$node->safe_psql(
- 'postgres', qq{
- CREATE EXTENSION injection_points;
-});
-
my $indexes_num = 3;
my $initial_rows_num = 10_000;
my $autovacuum_parallel_workers = 2;
@@ -134,64 +115,5 @@ ok( $node->log_contains(
qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
$log_offset));
-# Test 2:
-# Check whether parallel autovacuum leader can propagate cost-based parameters
-# to the parallel workers.
-
-$av_count = prepare_for_next_test($node, 2);
-$log_offset = -s $node->logfile;
-
-$node->safe_psql(
- 'postgres', qq{
- SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
- SELECT injection_points_attach('autovacuum-leader-before-indexes-processing', 'wait');
-
- ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
-});
-
-# Wait until parallel autovacuum is inited
-$node->wait_for_event('autovacuum worker',
- 'autovacuum-start-parallel-vacuum');
-
-# Update the shared cost-based delay parameters.
-$node->safe_psql(
- 'postgres', qq{
- ALTER SYSTEM SET vacuum_cost_limit = 500;
- ALTER SYSTEM SET vacuum_cost_page_miss = 10;
- ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
- ALTER SYSTEM SET vacuum_cost_page_hit = 10;
- SELECT pg_reload_conf();
-});
-
-# Resume the leader process to update the shared parameters during heap scan (i.e.
-# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
-# before vacuuming indexes due to the injection point.
-$node->safe_psql(
- 'postgres', qq{
- SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
-});
-$node->wait_for_event('autovacuum worker',
- 'autovacuum-leader-before-indexes-processing');
-
-# Check whether parallel worker successfully updated all parameters during
-# index processing
-$node->wait_for_log(
- qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=2, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
- $log_offset);
-
-$node->safe_psql(
- 'postgres', qq{
- SELECT injection_points_wakeup('autovacuum-leader-before-indexes-processing');
-});
-
-wait_for_autovacuum_complete($node, $av_count);
-
-# Cleanup
-$node->safe_psql(
- 'postgres', qq{
- SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
- SELECT injection_points_detach('autovacuum-leader-before-indexes-processing');
-});
-
$node->stop;
done_testing();
--
2.53.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-02 11:02 Alexander Korotkov <[email protected]>
parent: Masahiko Sawada <[email protected]>
1 sibling, 1 reply; 112+ messages in thread
From: Alexander Korotkov @ 2026-04-02 11:02 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Daniil Davydov <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi!
On Wed, Apr 1, 2026 at 9:55 PM Masahiko Sawada <[email protected]> wrote:
>
> On Mon, Mar 30, 2026 at 5:14 PM SATYANARAYANA NARLAPURAM
> <[email protected]> wrote:
> >
> > Hi
> >
> > On Mon, Mar 30, 2026 at 1:44 AM Daniil Davydov <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM
> >> <[email protected]> wrote:
> >> >
> >> > Thank you for working on this, very useful feature. Sharing a few thoughts:
> >> >
> >> > 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM resources in parallel_vacuum_compute_workers?
> >>
> >> Actually, autovacuum_max_parallel_workers is already limited by
> >> max_parallel_workers. It is not clear for me why we allow setting this GUC
> >> higher than max_parallel_workers, but if this happens, I think it is a user's
> >> misconfiguration.
> >>
> >> > 2. Is it intentional that other autovacuum workers not yield cost limits to the parallel auto vacuum workers? Cost limits are distributed first equally to the autovacuum workers.
> >> > and then they share that. Therefore, parallel workers will be heavily throttled. IIUC, this problem doesn't exist with manual vacuum.
> >> > If we don't fix this, at least we should document this.
> >>
> >> Parallel a/v workers inherit cost based parameters (including the
> >> vacuum_cost_limit) from the leader worker. Do you mean that this can be too
> >> low value for parallel operation? If so, user can manually increase the
> >> vacuum_cost_limit reloption for those tables, where parallel a/v sleeps too
> >> much (due to cost delay).
> >>
> >> BTW, describing the cost limit propagation to the parallel a/v workers is
> >> worth mentioning in the documentation. I'll add it in the next patch version.
> >>
> >> > 3. Additionally, is there a point where, based on the cost limits, launching additional workers becomes counterproductive compared to running fewer workers and preventing it?
> >>
> >> I don't think that we can possibly find a universal limit that will be
> >> appropriate for all possible configurations. By now we are using a pretty
> >> simple formula for parallel degree calculation. Since user have several ways
> >> to affect this formula, I guess that there will be no problems with it (except
> >> my concerns about opt-out style).
> >>
> >> > 4. Would it make sense to add a table level override to disable parallelism or set parallel worker count?
> >>
> >> We already have the "autovacuum_parallel_workers" reloption that is used as
> >> an additional limit for the number of parallel workers. In particular, this
> >> reloption can be used to disable parallelism at all.
> >>
> >> >
> >> > I ran some perf tests to show the improvements with parallel vacuum and shared below.
> >>
> >> Thank you very much!
> >>
> >> > Observations:
> >> >
> >> > 1. Parallel autovacuum provides consistent speedup. With cost_limit=200 and
> >> > 7 workers, vacuum completes 1.41x faster (71s -> 50s). With cost_limit=60,
> >> > the speedup is 1.25x (194s -> 154s).
> >> > 2. I see the benefit comes from parallelizing index vacuum. With 8 indexes totaling
> >> > ~530 MB, parallel workers scan indexes concurrently instead of the leader
> >> > scanning them one by one. The leader's CPU user time drops from ~3s to
> >> > ~0.8s as index work is offloaded
> >> >
> >>
> >> 1.41 speedup with 7 parallel workers may not seem like a great win, but it is
> >> a whole time of autovacuum operation (not only index bulkdel/cleanup) with
> >> pretty small indexes.
> >>
> >> May I ask you to run the same test with a higher table's size (several dozen
> >> gigabytes)? I think the results will be more "expressive".
> >
> >
> > I ran it with a Billion rows in a table with 8 indexes. The improvement with 7 workers is 1.8x.
> > Please note that there is a fixed overhead in other vacuum steps, for example heap scan.
> > In the environments where cost-based delay is used (the default), benefits will be modest
> > unless vacuum_cost_delay is set to sufficiently large value.
> >
> > Hardware:
> > CPU: Intel Xeon Platinum 8573C, 1 socket × 8 cores × 2 threads = 16 vCPUs
> > RAM: 128 GB (131,900 MB)
> > Swap: None
> >
> > Workload Description
> >
> > Table Schema:
> > CREATE TABLE avtest (
> > id bigint PRIMARY KEY,
> > col1 int, -- random()*1e9
> > col2 int, -- random()*1e9
> > col3 int, -- random()*1e9
> > col4 int, -- random()*1e9
> > col5 int, -- random()*1e9
> > col6 text, -- 'text_' || random()*1e6 (short text ~10 chars)
> > col7 timestamp, -- now() - random()*365 days
> > padding text -- repeat('x', 50)
> > ) WITH (fillfactor = 90);
> >
> > Indexes (8 total):
> > avtest_pkey — btree on (id) bigint
> > idx_av_col1 — btree on (col1) int
> > idx_av_col2 — btree on (col2) int
> > idx_av_col3 — btree on (col3) int
> > idx_av_col4 — btree on (col4) int
> > idx_av_col5 — btree on (col5) int
> > idx_av_col6 — btree on (col6) text
> > idx_av_col7 — btree on (col7) timestamp
> >
> > Dead Tuple Generation:
> > DELETE FROM avtest WHERE id % 5 IN (1, 2);
> > This deletes exactly 40% of rows, uniformly distributed across all pages.
> >
> > Vacuum Trigger:
> > Autovacuum is triggered naturally by lowering the threshold to 0 and setting
> > scale_factor to a value that causes immediate launch after the DELETE.
> >
> > Worker Configurations Tested:
> > 0 workers — leader-only vacuum (baseline, no parallelism)
> > 2 workers — leader + 2 parallel workers (3 processes total)
> > 4 workers — leader + 4 parallel workers (5 processes total)
> > 7 workers — leader + 7 parallel workers (8 processes total, 1 per index)
> >
> > Dataset:
> > Rows: 1,000,000,000
> > Heap size: 139 GB
> > Total size: 279 GB (heap + 8 indexes)
> > Dead tuples: 400,000,000 (40%)
> >
> > Index Sizes:
> > avtest_pkey 21 GB (bigint)
> > idx_av_col7 21 GB (timestamp)
> > idx_av_col1 18 GB (int)
> > idx_av_col2 18 GB (int)
> > idx_av_col3 18 GB (int)
> > idx_av_col4 18 GB (int)
> > idx_av_col5 18 GB (int)
> > idx_av_col6 7 GB (text — shorter keys, smaller index)
> > Total indexes: 139 GB
> >
> > Server Settings:
> > shared_buffers = 96GB
> > maintenance_work_mem = 1GB
> > max_wal_size = 100GB
> > checkpoint_timeout = 1h
> > autovacuum_vacuum_cost_delay = 0ms (NO throttling)
> > autovacuum_vacuum_cost_limit = 1000
> >
> >
> > Summary:
> >
> > Workers Avg(s) Min(s) Max(s) Speedup Time Saved
> > ------- ------ ------ ------ ------- ----------
> > 0 1645.93 1645.01 1646.84 1.00x —
> > 2 1276.35 1275.64 1277.05 1.29x 369.58s (6.2 min)
> > 4 1052.62 1048.92 1056.32 1.56x 593.31s (9.9 min)
> > 7 892.23 886.59 897.86 1.84x 753.70s (12.6 min)
> >
>
> Thank you for sharing the performance test results!
>
> While the benchmark results look good to me, have you compared the
> performance differences between parallel vacuum in the VACUUM command
> (with the PARALLEL option) and parallel vacuum in autovacuum? Since
> parallel autovacuum introduces some logic to check for delay parameter
> updates, I thought it was worth verifying if this adds any overhead.
>
> BTW, in my view, the most challenging part of this patch is the
> propagation logic for vacuum delay parameters. This propagation is
> necessary because, unlike manual VACUUM, autovacuum workers can reload
> their configuration during operation. We must ensure that parallel
> workers stay synchronized with these updated parameters.
>
> The current patch implements this in vacuumparallel.c: the leader
> shares delay parameters in DSM and updates them (if any vacuum delay
> parameters are updated) after a config reload, while workers poll for
> updates at every vacuum_delay_point() call to refresh their local
> variables.
>
> Another possible approach would be an event-driven model where the
> leader notifies workers after updating shared parameters—for example,
> by adding a shm_mq between the leader (as the sender) and each worker
> (as the receiver).
>
> I've compared these two ideas and opted for the former (polling).
> While a polling approach could theoretically be costly, the current
> implementation is self-contained within the parallel vacuum logic and
> does not touch the core parallel query infrastructure. The
> notification approach might look more elegant, but I'm concerned it
> adds unnecessary complexity just for the autovacuum case. Since the
> polling is essentially just checking an atomic variable, the overhead
> should be negligible.
>
> To verify this, I conducted benchmarks comparing the whole execution
> time and index vacuuming duration.
>
> Setup:
>
> - Disabled (auto) vacuum delays and buffer usage limits.
> - Parallel autovacuum with 1 worker on a table with 2 indexes (approx.
> 4 GB each).
> - 5 runs.
>
> Case 1: The latest patch (with polling)
>
> Average: 3.95s (Index: 1.54s)
> Median: 3.62s (Index: 1.37s)
>
> Case 2: The latest patch without polling
>
> Average: 3.98s (Index: 1.56s)
> Median: 3.70s (Index: 1.40s)
>
> Note that in order to simulate the code that doesn't have the polling,
> I reverted the following change:
>
> - if (InterruptPending ||
> - (!VacuumCostActive && !ConfigReloadPending))
> + if (InterruptPending)
> + return;
> +
> + if (IsParallelWorker())
> + {
> + /*
> + * Update cost-based vacuum delay parameters for a parallel autovacuum
> + * worker if any changes are detected.
> + */
> + parallel_vacuum_update_shared_delay_params();
> + }
> +
> + if (!VacuumCostActive && !ConfigReloadPending)
>
> The parallel vacuum workers don't check the shared vacuum delay
> parameter at all, which is still fine as I disabled vacuum delays.
>
> Overall, the results show no noticeable overhead from the polling approach.
I would say this polling approach is very cheap. When there are no
updates, it only has to check a single 32-bit value from shared
memory. And that value doesn't get updated frequently, so it caches
well. No wonder we see no measurable overhead.

Regarding the event-driven approach, given that the parallel worker
process is busy with other jobs (doing actual vacuuming), it would
anyway have to poll for new events from time to time. Thus, I don't
think it's possible to organize polling for new events any cheaper
than the current approach of polling for updates in shmem. If the
worker process were just waiting for GUC updates without any other
jobs, then, for instance, waiting on the latch would be cheaper than
polling in a loop, but that's not our case.

I don't see the current polling approach for GUC updates as a
performance problem.
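
To illustrate why the no-update path is so cheap, here is a minimal standalone sketch of a generation-counter polling scheme using C11 atomics. It is not the patch's code: SharedDelayParams, LocalDelayParams, leader_publish() and worker_poll() are hypothetical names, and the small atomic_flag spinlock only stands in for whatever locking the real shared structure uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the delay parameters kept in shared memory. */
typedef struct SharedDelayParams
{
    _Atomic uint32_t generation;    /* bumped by the leader after each update */
    atomic_flag lock;               /* tiny spinlock guarding the fields below */
    int         cost_limit;
    double      cost_delay_ms;
} SharedDelayParams;

/* A worker's private copy, plus the last generation it has seen. */
typedef struct LocalDelayParams
{
    uint32_t    seen_generation;
    int         cost_limit;
    double      cost_delay_ms;
} LocalDelayParams;

static void
params_lock(SharedDelayParams *shared)
{
    while (atomic_flag_test_and_set_explicit(&shared->lock, memory_order_acquire))
        ;                           /* spin; the critical sections are tiny */
}

static void
params_unlock(SharedDelayParams *shared)
{
    atomic_flag_clear_explicit(&shared->lock, memory_order_release);
}

/* Copy the shared values into the worker's local variables under the lock. */
static void
params_copy(SharedDelayParams *shared, LocalDelayParams *local)
{
    params_lock(shared);
    local->cost_limit = shared->cost_limit;
    local->cost_delay_ms = shared->cost_delay_ms;
    params_unlock(shared);
}

/* Leader: set up the shared area before any worker attaches. */
void
shared_params_init(SharedDelayParams *shared, int cost_limit, double cost_delay_ms)
{
    assert(shared != NULL);
    atomic_init(&shared->generation, 0);
    atomic_flag_clear(&shared->lock);
    shared->cost_limit = cost_limit;
    shared->cost_delay_ms = cost_delay_ms;
}

/* Leader: publish new values after a config reload, then bump the counter. */
void
leader_publish(SharedDelayParams *shared, int cost_limit, double cost_delay_ms)
{
    params_lock(shared);
    shared->cost_limit = cost_limit;
    shared->cost_delay_ms = cost_delay_ms;
    params_unlock(shared);
    atomic_fetch_add_explicit(&shared->generation, 1, memory_order_release);
}

/* Worker: take the initial copy of the parameters. */
void
worker_attach(SharedDelayParams *shared, LocalDelayParams *local)
{
    local->seen_generation =
        atomic_load_explicit(&shared->generation, memory_order_acquire);
    params_copy(shared, local);
}

/*
 * Worker: called at every delay point.  The common case is one atomic
 * 32-bit load and a comparison; the lock is taken only after a change.
 * Returns true if the local copy was refreshed.
 */
bool
worker_poll(SharedDelayParams *shared, LocalDelayParams *local)
{
    uint32_t    gen =
        atomic_load_explicit(&shared->generation, memory_order_acquire);

    if (gen == local->seen_generation)
        return false;

    params_copy(shared, local);
    local->seen_generation = gen;
    return true;
}
```

If the leader publishes again between the worker's load and its copy, the worker picks up the newer values and simply refreshes once more on the next poll, which is harmless.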
------
Regards,
Alexander Korotkov
Supabase
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-02 15:10 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-04-02 15:10 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Thu, Apr 2, 2026 at 6:16 AM Masahiko Sawada <[email protected]> wrote:
>
> Thank you for updating the patch! I found a bug in the following code:
>
> @@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs,
> IndexBulkDeleteResult **istats)
> DestroyParallelContext(pvs->pcxt);
> ExitParallelMode();
>
> + if (AmAutoVacuumWorkerProcess())
> + pv_shared_cost_params = NULL;
> +
>
> If an autovacuum worker raises an error during parallel vacuum, it
> doesn't reset pv_shared_cost_params. Then, if it doesn't use parallel vacuum
> on the next table to vacuum, it would end up with SEGV as it attempts
> to propagate the vacuum delay parameters.
Ouch. Indeed, I did not foresee this.
Thank you for noticing it!

I think we should add some cleanup for autovacuum near the ParallelContext
cleanup, since they are interconnected. I also want to bring back our tests
that trigger ERROR/PANIC in the leader worker in order to check whether all
resources are released. I hope I will be able to get to that by tomorrow
evening.
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] 0001-Make-sure-that-all-recourses-have-been-released-in-p.patch (2.1K, 2-0001-Make-sure-that-all-recourses-have-been-released-in-p.patch)
From b649988718442143256eae59f29b770d08f1fc97 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Thu, 2 Apr 2026 22:08:06 +0700
Subject: [PATCH] Make sure that all recourses have been released in parallel
autovacuum
---
src/backend/access/transam/parallel.c | 7 +++++++
src/backend/commands/vacuumparallel.c | 15 +++++++++++++++
src/include/access/parallel.h | 3 +++
3 files changed, 25 insertions(+)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ab1dfb30e73..81bca48bcfa 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1292,6 +1292,13 @@ AtEOXact_Parallel(bool isCommit)
elog(WARNING, "leaked parallel context");
DestroyParallelContext(pcxt);
}
+
+ /*
+ * Parallel autovacuum may have resources that depend on ParallelContext,
+ * but are local to vacuumparallel.c
+ */
+ if (AmAutoVacuumWorkerProcess())
+ AtEOXact_ParallelAutovacuum(isCommit);
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bac3bd28214..68d4a25528d 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -541,6 +541,21 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
pfree(pvs);
}
+/*
+ * End-of-transaction cleanup for parallel autovacuum.
+ */
+void
+AtEOXact_ParallelAutovacuum(bool isCommit)
+{
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ if (isCommit)
+ elog(WARNING, "leaked parallel autovacuum state");
+
+ pv_shared_cost_params = NULL;
+}
+
/*
* Returns the dead items space and dead items information.
*/
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 60f857675e0..553273b1529 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -80,4 +80,7 @@ extern void ParallelWorkerReportLastRecEnd(XLogRecPtr last_xlog_end);
extern void ParallelWorkerMain(Datum main_arg);
+/* vacuumparallel.c */
+extern void AtEOXact_ParallelAutovacuum(bool isCommit);
+
#endif /* PARALLEL_H */
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-02 23:00 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-02 23:00 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, Apr 2, 2026 at 8:10 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Thu, Apr 2, 2026 at 6:16 AM Masahiko Sawada <[email protected]> wrote:
> >
> > Thank you for updating the patch! I found a bug in the following code:
> >
> > @@ -457,6 +534,9 @@ parallel_vacuum_end(ParallelVacuumState *pvs,
> > IndexBulkDeleteResult **istats)
> > DestroyParallelContext(pvs->pcxt);
> > ExitParallelMode();
> >
> > + if (AmAutoVacuumWorkerProcess())
> > + pv_shared_cost_params = NULL;
> > +
> >
> > If an autovacuum worker raises an error during parallel vacuum, it
> > doesn't reset pv_shared_cost_params. Then, if it doesn't use parallel vacuum
> > on the next table to vacuum, it would end up with SEGV as it attempts
> > to propagate the vacuum delay parameters.
>
> Ouch. Indeed, I did not foresee this.
> Thank you for noticing it!
>
> I think we should add some cleanup for autovacuum near the ParallelContext
> cleanup, since they are interconnected. I also want to return our tests that
> are triggering ERROR/PANIC in the leader worker in order to check whether all
> resources are released. I hope I will be able to get to that by tomorrow
> evening.
I think that the beginning of the vacuum loop (in the PG_TRY() block in
vacuum()) seems a better place, as that is where we're resetting the vacuum
delay parameters:
in_vacuum = true;
VacuumFailsafeActive = false;
VacuumUpdateCosts();
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = NULL;
VacuumActiveNWorkers = NULL;
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-02 23:30 Masahiko Sawada <[email protected]>
parent: Alexander Korotkov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-02 23:30 UTC (permalink / raw)
To: Alexander Korotkov <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Daniil Davydov <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Thu, Apr 2, 2026 at 4:02 AM Alexander Korotkov <[email protected]> wrote:
>
> Hi!
>
> On Wed, Apr 1, 2026 at 9:55 PM Masahiko Sawada <[email protected]> wrote:
> >
> > On Mon, Mar 30, 2026 at 5:14 PM SATYANARAYANA NARLAPURAM
> > <[email protected]> wrote:
> > >
> > > Hi
> > >
> > > On Mon, Mar 30, 2026 at 1:44 AM Daniil Davydov <[email protected]> wrote:
> > >>
> > >> Hi,
> > >>
> > >> On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM
> > >> <[email protected]> wrote:
> > >> >
> > >> > Thank you for working on this, very useful feature. Sharing a few thoughts:
> > >> >
> > >> > 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM resources in parallel_vacuum_compute_workers?
> > >>
> > >> Actually, autovacuum_max_parallel_workers is already limited by
> > >> max_parallel_workers. It is not clear for me why we allow setting this GUC
> > >> higher than max_parallel_workers, but if this happens, I think it is a user's
> > >> misconfiguration.
> > >>
> > >> > 2. Is it intentional that other autovacuum workers not yield cost limits to the parallel auto vacuum workers? Cost limits are distributed first equally to the autovacuum workers.
> > >> > and then they share that. Therefore, parallel workers will be heavily throttled. IIUC, this problem doesn't exist with manual vacuum.
> > >> > If we don't fix this, at least we should document this.
> > >>
> > >> Parallel a/v workers inherit cost based parameters (including the
> > >> vacuum_cost_limit) from the leader worker. Do you mean that this can be too
> > >> low value for parallel operation? If so, user can manually increase the
> > >> vacuum_cost_limit reloption for those tables, where parallel a/v sleeps too
> > >> much (due to cost delay).
> > >>
> > >> BTW, describing the cost limit propagation to the parallel a/v workers is
> > >> worth mentioning in the documentation. I'll add it in the next patch version.
> > >>
> > >> > 3. Additionally, is there a point where, based on the cost limits, launching additional workers becomes counterproductive compared to running fewer workers and preventing it?
> > >>
> > >> I don't think that we can possibly find a universal limit that will be
> > >> appropriate for all possible configurations. By now we are using a pretty
> > >> simple formula for parallel degree calculation. Since user have several ways
> > >> to affect this formula, I guess that there will be no problems with it (except
> > >> my concerns about opt-out style).
> > >>
> > >> > 4. Would it make sense to add a table level override to disable parallelism or set parallel worker count?
> > >>
> > >> We already have the "autovacuum_parallel_workers" reloption that is used as
> > >> an additional limit for the number of parallel workers. In particular, this
> > >> reloption can be used to disable parallelism at all.
> > >>
> > >> >
> > >> > I ran some perf tests to show the improvements with parallel vacuum and shared below.
> > >>
> > >> Thank you very much!
> > >>
> > >> > Observations:
> > >> >
> > >> > 1. Parallel autovacuum provides consistent speedup. With cost_limit=200 and
> > >> > 7 workers, vacuum completes 1.41x faster (71s -> 50s). With cost_limit=60,
> > >> > the speedup is 1.25x (194s -> 154s).
> > >> > 2. I see the benefit comes from parallelizing index vacuum. With 8 indexes totaling
> > >> > ~530 MB, parallel workers scan indexes concurrently instead of the leader
> > >> > scanning them one by one. The leader's CPU user time drops from ~3s to
> > >> > ~0.8s as index work is offloaded
> > >> >
> > >>
> > >> 1.41 speedup with 7 parallel workers may not seem like a great win, but it is
> > >> a whole time of autovacuum operation (not only index bulkdel/cleanup) with
> > >> pretty small indexes.
> > >>
> > >> May I ask you to run the same test with a higher table's size (several dozen
> > >> gigabytes)? I think the results will be more "expressive".
> > >
> > >
> > > I ran it with a Billion rows in a table with 8 indexes. The improvement with 7 workers is 1.8x.
> > > Please note that there is a fixed overhead in other vacuum steps, for example heap scan.
> > > In the environments where cost-based delay is used (the default), benefits will be modest
> > > unless vacuum_cost_delay is set to sufficiently large value.
> > >
> > > Hardware:
> > > CPU: Intel Xeon Platinum 8573C, 1 socket × 8 cores × 2 threads = 16 vCPUs
> > > RAM: 128 GB (131,900 MB)
> > > Swap: None
> > >
> > > Workload Description
> > >
> > > Table Schema:
> > > CREATE TABLE avtest (
> > > id bigint PRIMARY KEY,
> > > col1 int, -- random()*1e9
> > > col2 int, -- random()*1e9
> > > col3 int, -- random()*1e9
> > > col4 int, -- random()*1e9
> > > col5 int, -- random()*1e9
> > > col6 text, -- 'text_' || random()*1e6 (short text ~10 chars)
> > > col7 timestamp, -- now() - random()*365 days
> > > padding text -- repeat('x', 50)
> > > ) WITH (fillfactor = 90);
> > >
> > > Indexes (8 total):
> > > avtest_pkey — btree on (id) bigint
> > > idx_av_col1 — btree on (col1) int
> > > idx_av_col2 — btree on (col2) int
> > > idx_av_col3 — btree on (col3) int
> > > idx_av_col4 — btree on (col4) int
> > > idx_av_col5 — btree on (col5) int
> > > idx_av_col6 — btree on (col6) text
> > > idx_av_col7 — btree on (col7) timestamp
> > >
> > > Dead Tuple Generation:
> > > DELETE FROM avtest WHERE id % 5 IN (1, 2);
> > > This deletes exactly 40% of rows, uniformly distributed across all pages.
> > >
> > > Vacuum Trigger:
> > > Autovacuum is triggered naturally by lowering the threshold to 0 and setting
> > > scale_factor to a value that causes immediate launch after the DELETE.
> > >
> > > Worker Configurations Tested:
> > > 0 workers — leader-only vacuum (baseline, no parallelism)
> > > 2 workers — leader + 2 parallel workers (3 processes total)
> > > 4 workers — leader + 4 parallel workers (5 processes total)
> > > 7 workers — leader + 7 parallel workers (8 processes total, 1 per index)
> > >
> > > Dataset:
> > > Rows: 1,000,000,000
> > > Heap size: 139 GB
> > > Total size: 279 GB (heap + 8 indexes)
> > > Dead tuples: 400,000,000 (40%)
> > >
> > > Index Sizes:
> > > avtest_pkey 21 GB (bigint)
> > > idx_av_col7 21 GB (timestamp)
> > > idx_av_col1 18 GB (int)
> > > idx_av_col2 18 GB (int)
> > > idx_av_col3 18 GB (int)
> > > idx_av_col4 18 GB (int)
> > > idx_av_col5 18 GB (int)
> > > idx_av_col6 7 GB (text — shorter keys, smaller index)
> > > Total indexes: 139 GB
> > >
> > > Server Settings:
> > > shared_buffers = 96GB
> > > maintenance_work_mem = 1GB
> > > max_wal_size = 100GB
> > > checkpoint_timeout = 1h
> > > autovacuum_vacuum_cost_delay = 0ms (NO throttling)
> > > autovacuum_vacuum_cost_limit = 1000
> > >
> > >
> > > Summary:
> > >
> > > Workers Avg(s) Min(s) Max(s) Speedup Time Saved
> > > ------- ------ ------ ------ ------- ----------
> > > 0 1645.93 1645.01 1646.84 1.00x —
> > > 2 1276.35 1275.64 1277.05 1.29x 369.58s (6.2 min)
> > > 4 1052.62 1048.92 1056.32 1.56x 593.31s (9.9 min)
> > > 7 892.23 886.59 897.86 1.84x 753.70s (12.6 min)
> > >
> >
> > Thank you for sharing the performance test results!
> >
> > While the benchmark results look good to me, have you compared the
> > performance differences between parallel vacuum in the VACUUM command
> > (with the PARALLEL option) and parallel vacuum in autovacuum? Since
> > parallel autovacuum introduces some logic to check for delay parameter
> > updates, I thought it was worth verifying if this adds any overhead.
> >
> > BTW, in my view, the most challenging part of this patch is the
> > propagation logic for vacuum delay parameters. This propagation is
> > necessary because, unlike manual VACUUM, autovacuum workers can reload
> > their configuration during operation. We must ensure that parallel
> > workers stay synchronized with these updated parameters.
> >
> > The current patch implements this in vacuumparallel.c: the leader
> > shares delay parameters in DSM and updates them (if any vacuum delay
> > parameters are updated) after a config reload, while workers poll for
> > updates at every vacuum_delay_point() call to refresh their local
> > variables.
> >
> > Another possible approach would be an event-driven model where the
> > leader notifies workers after updating shared parameters—for example,
> > by adding a shm_mq between the leader (as the sender) and each worker
> > (as the receiver).
> >
> > I've compared these two ideas and opted for the former (polling).
> > While a polling approach could theoretically be costly, the current
> > implementation is self-contained within the parallel vacuum logic and
> > does not touch the core parallel query infrastructure. The
> > notification approach might look more elegant, but I'm concerned it
> > adds unnecessary complexity just for the autovacuum case. Since the
> > polling is essentially just checking an atomic variable, the overhead
> > should be negligible.
> >
> > To verify this, I conducted benchmarks comparing the whole execution
> > time and index vacuuming duration.
> >
> > Setup:
> >
> > - Disabled (auto) vacuum delays and buffer usage limits.
> > - Parallel autovacuum with 1 worker on a table with 2 indexes (approx.
> > 4 GB each).
> > - 5 runs.
> >
> > Case 1: The latest patch (with polling)
> >
> > Average: 3.95s (Index: 1.54s)
> > Median: 3.62s (Index: 1.37s)
> >
> > Case 2: The latest patch without polling
> >
> > Average: 3.98s (Index: 1.56s)
> > Median: 3.70s (Index: 1.40s)
> >
> > Note that in order to simulate the code that doesn't have the polling,
> > I reverted the following change:
> >
> > - if (InterruptPending ||
> > - (!VacuumCostActive && !ConfigReloadPending))
> > + if (InterruptPending)
> > + return;
> > +
> > + if (IsParallelWorker())
> > + {
> > + /*
> > + * Update cost-based vacuum delay parameters for a parallel autovacuum
> > + * worker if any changes are detected.
> > + */
> > + parallel_vacuum_update_shared_delay_params();
> > + }
> > +
> > + if (!VacuumCostActive && !ConfigReloadPending)
> >
> > The parallel vacuum workers don't check the shared vacuum delay
> > parameter at all, which is still fine as I disabled vacuum delays.
> >
> > Overall, the results show no noticeable overhead from the polling approach.
>
> I would say this polling approach is very cheap. When there are no
> updates, it only has to check a single 32-bit value from shared
> memory. And that value doesn't get updated frequently; it's good for
> caching. No wonder we see no measurable overhead.
Thank you for the comments!
>
> Regarding the event-driven approach, given that the parallel worker
> process is busy with other jobs (doing actual vacuuming), it would
> anyway have to poll for new events from time to time. Thus, I don't
> think it's possible to organize polling for new events any cheaper
> than the current approach of polling for updates in shmem.
What do you think about the idea of using proc signals like the patch
I've sent recently[1]? With that approach, workers only have to check a
local variable. It seems slightly cheaper and can use the existing
logic.
[1] https://www.postgresql.org/message-id/CAD21AoBm0cxQjtWuY0f7%2BaT4UiRV%2B%2BaFKkzjj6vmERTj_UFnxA%40ma...
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-03 11:43 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Daniil Davydov @ 2026-04-03 11:43 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Fri, Apr 3, 2026 at 6:31 AM Masahiko Sawada <[email protected]> wrote:
>
> What do you think about the idea of using proc signals like the patch
> I've sent recently[1]? With that approach, workers have to check the
> local variable. It seems slightly cheaper and can use the existing
> logic.
>
Thank you for the patch!
1) Maybe we should implement this logic within ParallelMessages? For example,
I see this ParallelMessages use case in parallel a/v:
/*
* Call the parallel variant of pgstat_progress_incr_param so workers can
* report progress of index vacuum to the leader.
*/
pgstat_progress_parallel_incr_param(PROGRESS_VACUUM_INDEXES_PROCESSED, 1);
I.e. parallel a/v workers already communicate with the leader via
ParallelMessages, so it would be convenient to extend this protocol with a new
message.

2) I don't think that the difference between accessing an atomic and a local
variable can be measured for parallel workers. But sending a signal to every
parallel worker is surely slower than just incrementing an atomic variable.

IIUC you created this patch in order to reuse existing infrastructure instead
of creating a new ad hoc solution. However, I think that the polling approach
and the signalling approach (in its current implementation) are basically
equivalent. I mean that in both cases we have an autovacuum-specific mechanism
to share particular parameters between particular workers.

I will try to explain how I see the solution to this problem:
Your implementation can be made more abstract, so that it becomes a new
internal mechanism that other modules can use in the future. E.g. we can create
an interface that allows any parallel leader (not necessarily just an
autovacuum leader) to inform its parallel workers that some config parameters
have been changed. At the same time, parallel workers can use this interface to
specify which parameters (or groups of parameters) they want to consume from
the leader. What do you think?
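
A rough standalone sketch of the kind of interface meant here, purely for illustration: workers subscribe to parameter groups, the leader flags only the workers that care about a change, and each worker later drains its pending set. Everything in it (the PARAM_GROUP_* constants, WorkerSlot, worker_subscribe(), leader_notify(), worker_consume()) is a hypothetical name, not an existing PostgreSQL API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter groups a worker may ask to be told about. */
enum
{
    PARAM_GROUP_VACUUM_COST = 1u << 0,
    PARAM_GROUP_BUFFER_USAGE = 1u << 1
};

/* One slot per parallel worker, assumed to live in shared memory. */
typedef struct WorkerSlot
{
    uint32_t         subscribed;    /* set once by the worker at startup */
    _Atomic uint32_t pending;       /* groups changed since the last consume */
} WorkerSlot;

/* Worker: declare which parameter groups it wants to receive. */
void
worker_subscribe(WorkerSlot *slot, uint32_t groups)
{
    assert(slot != NULL);
    slot->subscribed = groups;
    atomic_init(&slot->pending, 0);
}

/*
 * Leader: mark the changed groups as pending for every worker subscribed
 * to at least one of them.  Returns the number of workers flagged.
 */
int
leader_notify(WorkerSlot *slots, int nslots, uint32_t changed)
{
    int         notified = 0;

    for (int i = 0; i < nslots; i++)
    {
        uint32_t    relevant = slots[i].subscribed & changed;

        if (relevant != 0)
        {
            atomic_fetch_or(&slots[i].pending, relevant);
            notified++;
        }
    }
    return notified;
}

/* Worker: fetch and clear the set of groups that changed. */
uint32_t
worker_consume(WorkerSlot *slot)
{
    return atomic_exchange(&slot->pending, 0);
}
```

How the worker learns that it should call worker_consume() (polling, a signal, or a ParallelMessage) is left open here, since that is exactly the point under discussion.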
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-03 13:45 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-04-03 13:45 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Fri, Apr 3, 2026 at 6:00 AM Masahiko Sawada <[email protected]> wrote:
>
> On Thu, Apr 2, 2026 at 8:10 AM Daniil Davydov <[email protected]> wrote:
> >
> > I think we should add some cleanup for autovacuum near the ParallelContext
> > cleanup, since they are interconnected. I also want to return our tests that
> > are triggering ERROR/PANIC in the leader worker in order to check whether all
> > resources are released. I hope I will be able to get to that by tomorrow
> > evening.
>
> I think that the beginning of vacuum loop (in PG_TRY() block in
> vacuum()) seems better place as we're resetting vacuum delay
> parameters:
>
> in_vacuum = true;
> VacuumFailsafeActive = false;
> VacuumUpdateCosts();
> VacuumCostBalance = 0;
> VacuumCostBalanceLocal = 0;
> VacuumSharedCostBalance = NULL;
> VacuumActiveNWorkers = NULL;
>
I still think that this pointer is related to the ParallelContext, and it
is a bit confusing that we can manipulate it outside of all the "parallel"
logic. Since this variable points into the DSM, it seems natural to me for its
lifetime to match that of the DSM. Please see the attached patch, which resets
this pointer during DSM detach.
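
The idea can be shown with a small standalone model (not PostgreSQL code): a toy segment that runs registered callbacks when it is detached, and a process-local pointer that one of those callbacks clears. Segment, segment_on_detach(), segment_detach() and the other names are hypothetical stand-ins; in the attached patch the registration is done with on_dsm_detach().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DETACH_CALLBACKS 4

typedef struct Segment Segment;
typedef void (*DetachCallback) (Segment *seg, void *arg);

/* Toy stand-in for a shared memory segment with detach callbacks. */
struct Segment
{
    void       *mem;
    int         ncallbacks;
    DetachCallback callbacks[MAX_DETACH_CALLBACKS];
    void       *args[MAX_DETACH_CALLBACKS];
};

typedef struct CostParams
{
    int         cost_limit;
} CostParams;

/* Process-local pointer into the segment; it must never outlive it. */
static CostParams *shared_cost_params = NULL;

Segment *
segment_create(size_t size)
{
    Segment    *seg = calloc(1, sizeof(Segment));

    if (seg == NULL)
        return NULL;
    seg->mem = calloc(1, size);
    if (seg->mem == NULL)
    {
        free(seg);
        return NULL;
    }
    return seg;
}

void
segment_on_detach(Segment *seg, DetachCallback cb, void *arg)
{
    assert(seg->ncallbacks < MAX_DETACH_CALLBACKS);
    seg->callbacks[seg->ncallbacks] = cb;
    seg->args[seg->ncallbacks] = arg;
    seg->ncallbacks++;
}

/* Called on both the normal path and the error path. */
void
segment_detach(Segment *seg)
{
    /* Run callbacks in reverse registration order, then free the memory. */
    while (seg->ncallbacks > 0)
    {
        seg->ncallbacks--;
        seg->callbacks[seg->ncallbacks] (seg, seg->args[seg->ncallbacks]);
    }
    free(seg->mem);
    free(seg);
}

/* Clear every pointer into the segment that is going away. */
static void
reset_cost_params(Segment *seg, void *arg)
{
    (void) seg;
    (void) arg;
    shared_cost_params = NULL;
}

/* Point at the shared parameters and tie the pointer's lifetime to the segment. */
void
parallel_setup(Segment *seg, int cost_limit)
{
    shared_cost_params = (CostParams *) seg->mem;
    shared_cost_params->cost_limit = cost_limit;
    segment_on_detach(seg, reset_cost_params, NULL);
}

bool
cost_params_attached(void)
{
    return shared_cost_params != NULL;
}
```

Because the reset is attached to the segment rather than to one particular exit path, it does not matter whether the detach happens at normal completion or during error cleanup.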
--
Best regards,
Daniil Davydov
Attachments:
[text/x-patch] 0001-Reset-pointer-into-the-going-away-DSM.patch (1.7K, 2-0001-Reset-pointer-into-the-going-away-DSM.patch)
From 50052b41b9a8dc270720c0325b3afc270e8f9b5c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Fri, 3 Apr 2026 20:31:11 +0700
Subject: [PATCH] Reset pointer into the going away DSM
---
src/backend/commands/vacuumparallel.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bac3bd28214..a01ace9343c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -294,6 +294,7 @@ static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_inde
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
+static void parallel_vacuum_cleanup(dsm_segment *seg, Datum arg);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -467,6 +468,7 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
SpinLockInit(&shared->cost_params.mutex);
pv_shared_cost_params = &(shared->cost_params);
+ on_dsm_detach(pcxt->seg, parallel_vacuum_cleanup, (Datum) 0);
}
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
@@ -541,6 +543,17 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
pfree(pvs);
}
+/*
+ * Cleanup for parallel autovacuum.
+ */
+static void
+parallel_vacuum_cleanup(dsm_segment *seg, Datum arg)
+{
+ /* We need to reset all pointers into the DSM that is going away */
+ Assert(AmAutoVacuumWorkerProcess());
+ pv_shared_cost_params = NULL;
+}
+
/*
* Returns the dead items space and dead items information.
*/
--
2.43.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-04 01:11 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-04 01:11 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Fri, Apr 3, 2026 at 6:45 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Fri, Apr 3, 2026 at 6:00 AM Masahiko Sawada <[email protected]> wrote:
> >
> > On Thu, Apr 2, 2026 at 8:10 AM Daniil Davydov <[email protected]> wrote:
> > >
> > > I think we should add some cleanup for autovacuum near the ParallelContext
> > > cleanup, since they are interconnected. I also want to return our tests that
> > > are triggering ERROR/PANIC in the leader worker in order to check whether all
> > > resources are released. I hope I will be able to get to that by tomorrow
> > > evening.
> >
> > I think that the beginning of vacuum loop (in PG_TRY() block in
> > vacuum()) seems better place as we're resetting vacuum delay
> > parameters:
> >
> > in_vacuum = true;
> > VacuumFailsafeActive = false;
> > VacuumUpdateCosts();
> > VacuumCostBalance = 0;
> > VacuumCostBalanceLocal = 0;
> > VacuumSharedCostBalance = NULL;
> > VacuumActiveNWorkers = NULL;
> >
>
> I am still thinking that this pointer is related to the ParallelContext, and it
> is a bit confusing that we can manipulate it outside all "parallel" logic.
> Since this variable points to the DSM it looks very natural for me if its
> lifetime will be similar to the DSM. Please, see attached patch, that resets
> this pointer during dsm detaching.
Sounds like a reasonable approach.

Regarding the regression tests, ISTM we no longer need the
'autovacuum-leader-before-indexes-processing' injection point since it
currently tests that parallel workers update their delay parameters
during initialization (i.e., in parallel_vacuum_main()). In order
to verify the behavior of workers updating their delay parameters
while processing indexes, we would need another injection point to
stop parallel workers, which seems overkill to me. So I removed it, but
the test still covers the propagation logic.

Regarding the patch, I don't think it's a good idea to include
bgworker_internals.h from reloptions.c:
@@ -28,6 +28,7 @@
#include "commands/defrem.h"
#include "commands/tablespace.h"
#include "nodes/makefuncs.h"
+#include "postmaster/bgworker_internals.h"
#include "storage/lock.h"
#include "utils/array.h"
#include "utils/attoptcache.h"
@@ -236,6 +237,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, MAX_PARALLEL_WORKER_LIMIT
+ },
I'd leave the maximum value as 1024.

I've attached the patch; please check it. I think it's in good shape, and
I'm going to push it next Monday, barring objections.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-patch] v37-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch (42.3K, 2-v37-0001-Allow-autovacuum-to-use-parallel-vacuum-workers.patch)
download | inline diff:
From dc6105c860c145a018e928b97a70b84e29d05c02 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Tue, 17 Mar 2026 02:18:09 +0700
Subject: [PATCH v37] Allow autovacuum to use parallel vacuum workers.
Previously, autovacuum always disabled parallel vacuum regardless of
the table's index count or configuration. This commit enables
autovacuum workers to use parallel index vacuuming and index cleanup,
using the same parallel vacuum infrastructure as manual VACUUM.
Two new configuration options control the feature. The GUC
autovacuum_max_parallel_workers sets the maximum number of parallel
workers a single autovacuum worker may launch; it defaults to 0,
preserving existing behavior unless explicitly enabled. The per-table
storage parameter autovacuum_parallel_workers provides per-table limits.
A value of 0 disables parallel vacuum for the table, a positive value
caps the worker count (still bounded by the GUC), and -1 (the default)
defers to the GUC.
To handle cases where autovacuum workers receive a SIGHUP and update
their cost-based vacuum delay parameters mid-operation, a new
propagation mechanism is added to vacuumparallel.c. The leader stores
its effective cost parameters in a DSM segment. Parallel vacuum
workers poll for changes in vacuum_delay_point(); if an update is
detected, they apply the new values locally via VacuumUpdateCosts().
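The propagation scheme can be pictured with the following standalone sketch. It is an assumption-laden simplification, not the patch's code: C11 atomics and an atomic_flag spinlock stand in for pg_atomic_uint32 and slock_t, only two of the five cost parameters are modeled, and every Sketch*/sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the shared state kept in the DSM segment */
typedef struct SketchSharedParams
{
    atomic_uint generation;     /* starts at 1, bumped by the leader */
    atomic_flag lock;           /* stand-in for a spinlock */
    double      cost_delay;
    int         cost_limit;
} SketchSharedParams;

/* Stand-in for a parallel worker's local state */
typedef struct SketchWorker
{
    unsigned    local_generation;   /* starts at 0, so the first poll loads */
    double      cost_delay;
    int         cost_limit;
} SketchWorker;

static void
sketch_lock(SketchSharedParams *s)
{
    while (atomic_flag_test_and_set(&s->lock))
        ;                       /* spin */
}

static void
sketch_unlock(SketchSharedParams *s)
{
    atomic_flag_clear(&s->lock);
}

static void
sketch_init(SketchSharedParams *s, double delay, int limit)
{
    atomic_flag_clear(&s->lock);
    s->cost_delay = delay;
    s->cost_limit = limit;
    atomic_init(&s->generation, 1);
}

/* Leader side: publish new values only if something changed */
static bool
sketch_leader_publish(SketchSharedParams *s, double delay, int limit)
{
    /* Only the leader writes these fields, so it can compare unlocked */
    if (s->cost_delay == delay && s->cost_limit == limit)
        return false;

    sketch_lock(s);
    s->cost_delay = delay;
    s->cost_limit = limit;
    sketch_unlock(s);

    /* Tell workers that they should re-read the parameters */
    atomic_fetch_add(&s->generation, 1);
    return true;
}

/* Worker side: cheap generation check, then copy under the lock */
static bool
sketch_worker_poll(SketchWorker *w, SketchSharedParams *s)
{
    unsigned gen = atomic_load(&s->generation);

    if (gen == w->local_generation)
        return false;

    sketch_lock(s);
    w->cost_delay = s->cost_delay;
    w->cost_limit = s->cost_limit;
    sketch_unlock(s);

    w->local_generation = gen;
    return true;
}
```

The point of the generation counter in this sketch is that the common case (nothing changed) costs a single atomic read per poll; the lock is taken only when the worker actually needs to copy new values.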
A new test module, src/test/modules/test_autovacuum, is added to
verify that parallel autovacuum workers are correctly launched and
that cost-parameter updates are propagated as expected.
Author: Daniil Davydov <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Sami Imseih <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Bharath Rupireddy <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
Reviewed-by: zengman <[email protected]>
Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com
---
doc/src/sgml/config.sgml | 24 ++
doc/src/sgml/maintenance.sgml | 34 +++
doc/src/sgml/ref/create_table.sgml | 16 ++
doc/src/sgml/ref/vacuum.sgml | 23 +-
src/backend/access/common/reloptions.c | 11 +
src/backend/access/heap/vacuumlazy.c | 12 +
src/backend/commands/vacuum.c | 22 +-
src/backend/commands/vacuumparallel.c | 227 +++++++++++++++++-
src/backend/postmaster/autovacuum.c | 25 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 10 +-
src/backend/utils/misc/guc_parameters.dat | 8 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/commands/vacuum.h | 2 +
src/include/miscadmin.h | 1 +
src/include/utils/rel.h | 2 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 20 ++
src/test/modules/test_autovacuum/meson.build | 15 ++
.../t/001_parallel_autovacuum.pl | 189 +++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
24 files changed, 616 insertions(+), 33 deletions(-)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d3fea738ca3..5840f497817 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2918,6 +2918,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9486,6 +9487,29 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel workers that can be used by a
+ single autovacuum worker to process indexes. This limit applies
+ specifically to the index vacuuming and index cleanup phases (for the
+ details of each autovacuum phase, please refer to <xref linkend="vacuum-phases"/>).
+ The actual number of parallel workers is further limited by
+ <xref linkend="guc-max-parallel-workers"/>. This is the
+ per-autovacuum worker equivalent of the <literal>PARALLEL</literal>
+ option of the <link linkend="sql-vacuum"><command>VACUUM</command></link>
+ command. Setting this value to 0 disables parallel vacuum during autovacuum.
+ The default is 0.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0d2a28207ed..64bbc831343 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1038,6 +1038,10 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
per-table <literal>autovacuum_vacuum_cost_delay</literal> or
<literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
are not considered in the balancing algorithm.
+ Parallel workers launched for <xref linkend="parallel-vacuum"/> use
+ the same cost delay parameters as the leader worker. If any of these
+ parameters are changed in the leader worker, it will propagate the new
+ parameter values to all of its parallel workers.
</para>
<para>
@@ -1166,6 +1170,36 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</sect3>
</sect2>
+
+ <sect2 id="parallel-vacuum" xreflabel="Parallel Vacuum">
+ <title>Parallel Vacuum</title>
+
+ <para>
+ <command>VACUUM</command> can perform index vacuuming and index cleanup
+ phases in parallel using background workers (for the details of each
+ vacuum phase, please refer to <xref linkend="vacuum-phases"/>). The
+ degree of parallelism is determined by the number of indexes on the
+ relation that support parallel vacuum. For manual <command>VACUUM</command>,
+ this is limited by the <literal>PARALLEL</literal> option, which is
+ further capped by <xref linkend="guc-max-parallel-maintenance-workers"/>.
+ For autovacuum, it is limited by the table's
+ <xref linkend="reloption-autovacuum-parallel-workers"/>, if set, which is
+ further capped by the
+ <xref linkend="guc-autovacuum-max-parallel-workers"/> parameter. Please
+ note that it is not guaranteed that the number of parallel workers that was
+ calculated will be used during execution. It is possible for a vacuum to
+ run with fewer workers than specified, or even with no workers at all.
+ </para>
+
+ <para>
+ An index can participate in parallel vacuum if and only if the size of the
+ index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+ Only one worker can be used per index. So parallel workers are launched
+ only when there are at least <literal>2</literal> indexes in the table.
+ Workers for vacuum are launched before the start of each phase and exit at
+ the end of the phase. These behaviors might change in a future release.
+ </para>
+ </sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 80829b23945..e342585c7f0 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1738,6 +1738,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Per-table value for the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ parameter. If -1 is specified, the value of <varname>autovacuum_max_parallel_workers</varname>
+ will be used. If set to 0, parallel vacuum is disabled for
+ this table. The default value is -1.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index ac5d083d468..38ee973ea05 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -81,7 +81,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
is not obtained. However, extra space is not returned to the operating
system (in most cases); it's just kept available for re-use within the
same table. It also allows us to leverage multiple CPUs in order to process
- indexes. This feature is known as <firstterm>parallel vacuum</firstterm>.
+ indexes. This feature is known as <firstterm><xref linkend="parallel-vacuum"/></firstterm>.
To disable this feature, one can use <literal>PARALLEL</literal> option and
specify parallel workers as zero. <command>VACUUM FULL</command> rewrites
the entire contents of the table into a new disk file with no extra space,
@@ -266,24 +266,9 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<term><literal>PARALLEL</literal></term>
<listitem>
<para>
- Perform index vacuum and index cleanup phases of <command>VACUUM</command>
- in parallel using <replaceable class="parameter">integer</replaceable>
- background workers (for the details of each vacuum phase, please
- refer to <xref linkend="vacuum-phases"/>). The number of workers used
- to perform the operation is equal to the number of indexes on the
- relation that support parallel vacuum which is limited by the number of
- workers specified with <literal>PARALLEL</literal> option if any which is
- further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
- An index can participate in parallel vacuum if and only if the size of the
- index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
- Please note that it is not guaranteed that the number of parallel workers
- specified in <replaceable class="parameter">integer</replaceable> will be
- used during execution. It is possible for a vacuum to run with fewer
- workers than specified, or even with no workers at all. Only one worker
- can be used per index. So parallel workers are launched only when there
- are at least <literal>2</literal> indexes in the table. Workers for
- vacuum are launched before the start of each phase and exit at the end of
- the phase. These behaviors might change in a future release. This
+ Specifies the maximum number of parallel workers that can be used
+ for <xref linkend="parallel-vacuum"/>, which is further limited
+ by <xref linkend="guc-max-parallel-maintenance-workers"/>. This
option can't be used with the <literal>FULL</literal> option.
</para>
</listitem>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b41eafd7691..3e832c3797e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -236,6 +236,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1969,6 +1978,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 88c71cd85b6..39395aed0d5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -152,6 +152,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/read_stream.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/pg_rusage.h"
#include "utils/timestamp.h"
@@ -862,6 +863,17 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params,
lazy_check_wraparound_failsafe(vacrel);
dead_items_alloc(vacrel, params->nworkers);
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * Used by tests to pause before parallel vacuum is launched, allowing
+ * test code to modify configuration that the leader then propagates to
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess() && ParallelVacuumIsActive(vacrel))
+ INJECTION_POINT("autovacuum-start-parallel-vacuum", NULL);
+#endif
+
/*
* Call lazy_scan_heap to perform all required heap pruning, index
* vacuuming, and heap vacuuming (plus related processing)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0ed363d1c85..bc3e57ad63c 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2435,8 +2435,20 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (IsParallelWorker())
+ {
+ /*
+ * Update cost-based vacuum delay parameters for a parallel autovacuum
+ * worker if any changes are detected. This might enable cost-based
+ * delay, so it needs to be called before the VacuumCostActive check.
+ */
+ parallel_vacuum_update_shared_delay_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2450,6 +2462,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * Propagate cost-based vacuum delay parameters to shared memory if
+ * any of them have changed during the config reload.
+ */
+ parallel_vacuum_propagate_shared_delay_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77834b96a21..979c2be4abd 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -16,6 +18,13 @@
* the parallel context is re-initialized so that the same DSM can be used for
* multiple passes of index bulk-deletion and index cleanup.
*
+ * For parallel autovacuum, we need to propagate cost-based vacuum delay
+ * parameters from the leader to its workers, as the leader's parameters can
+ * change even while processing a table (e.g., due to a config reload).
+ * The PVSharedCostParams struct manages these parameters using a
+ * generation counter. Each parallel worker polls this shared state and
+ * refreshes its local delay parameters whenever a change is detected.
+ *
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
@@ -51,6 +60,33 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
+typedef struct PVSharedCostParams
+{
+ /*
+ * The generation counter is incremented by the leader process each time
+ * it updates the shared cost-based vacuum delay parameters. Parallel
+ * vacuum workers compare it with their local generation,
+ * shared_params_generation_local, to detect whether they need to refresh
+ * their local parameters. The generation starts from 1 so that a freshly
+ * started worker (whose local copy is 0) will always load the initial
+ * parameters on its first check.
+ */
+ pg_atomic_uint32 generation;
+
+ slock_t mutex; /* protects all fields below */
+
+ /* Parameters to share with parallel workers */
+ double cost_delay;
+ int cost_limit;
+ int cost_page_dirty;
+ int cost_page_hit;
+ int cost_page_miss;
+} PVSharedCostParams;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -120,6 +156,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool is_autovacuum;
+
+ /*
+ * Cost-based vacuum delay parameters shared between the autovacuum leader
+ * and its parallel workers.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -222,6 +270,17 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
+/*
+ * Worker-local copy of the last cost-parameter generation this worker has
+ * applied. Initialized to 0; since the leader initializes the shared
+ * generation counter to 1, the first call to
+ * parallel_vacuum_update_shared_delay_params() will always detect a
+ * mismatch and read the initial parameters from shared memory.
+ */
+static uint32 shared_params_generation_local = 0;
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -233,6 +292,8 @@ static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation
static bool parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
bool vacuum);
static void parallel_vacuum_error_callback(void *arg);
+static inline void parallel_vacuum_set_cost_parameters(PVSharedCostParams *params);
+static void parallel_vacuum_dsm_detach(dsm_segment *seg, Datum arg);
/*
* Try to enter parallel mode and create a parallel context. Then initialize
@@ -374,8 +435,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -392,6 +454,22 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->is_autovacuum = AmAutoVacuumWorkerProcess();
+
+ /*
+ * Initialize shared cost-based vacuum delay parameters if it's for
+ * autovacuum.
+ */
+ if (shared->is_autovacuum)
+ {
+ parallel_vacuum_set_cost_parameters(&shared->cost_params);
+ pg_atomic_init_u32(&shared->cost_params.generation, 1);
+ SpinLockInit(&shared->cost_params.mutex);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ on_dsm_detach(pcxt->seg, parallel_vacuum_dsm_detach, (Datum) 0);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -457,10 +535,26 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
DestroyParallelContext(pvs->pcxt);
ExitParallelMode();
+ if (AmAutoVacuumWorkerProcess())
+ pv_shared_cost_params = NULL;
+
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
}
+/*
+ * DSM detach callback. This is invoked when an autovacuum worker detaches
+ * from the DSM segment holding PVShared. It ensures that the local pointer
+ * to the shared state is reset even if parallel vacuum raises an error and
+ * doesn't call parallel_vacuum_end().
+ */
+static void
+parallel_vacuum_dsm_detach(dsm_segment *seg, Datum arg)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+ pv_shared_cost_params = NULL;
+}
+
/*
* Returns the dead items space and dead items information.
*/
@@ -534,6 +628,103 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wstats);
}
+/*
+ * Fill in the given structure with cost-based vacuum delay parameter values.
+ */
+static inline void
+parallel_vacuum_set_cost_parameters(PVSharedCostParams *params)
+{
+ params->cost_delay = vacuum_cost_delay;
+ params->cost_limit = vacuum_cost_limit;
+ params->cost_page_dirty = VacuumCostPageDirty;
+ params->cost_page_hit = VacuumCostPageHit;
+ params->cost_page_miss = VacuumCostPageMiss;
+}
+
+/*
+ * Updates the cost-based vacuum delay parameters for parallel autovacuum
+ * workers.
+ *
+ * For non-autovacuum parallel workers, this function will have no effect.
+ */
+void
+parallel_vacuum_update_shared_delay_params(void)
+{
+ uint32 params_generation;
+
+ Assert(IsParallelWorker());
+
+ /* Quick return if the worker is not running for autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+ Assert(shared_params_generation_local <= params_generation);
+
+ /* Return if parameters have not changed in the leader */
+ if (params_generation == shared_params_generation_local)
+ return;
+
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ VacuumCostDelay = pv_shared_cost_params->cost_delay;
+ VacuumCostLimit = pv_shared_cost_params->cost_limit;
+ VacuumCostPageDirty = pv_shared_cost_params->cost_page_dirty;
+ VacuumCostPageHit = pv_shared_cost_params->cost_page_hit;
+ VacuumCostPageMiss = pv_shared_cost_params->cost_page_miss;
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ VacuumUpdateCosts();
+
+ shared_params_generation_local = params_generation;
+
+ elog(DEBUG2,
+ "parallel autovacuum worker updated cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+ vacuum_cost_limit,
+ vacuum_cost_delay,
+ VacuumCostPageMiss,
+ VacuumCostPageDirty,
+ VacuumCostPageHit);
+}
+
+/*
+ * Store the cost-based vacuum delay parameters in the shared memory so that
+ * parallel vacuum workers can consume them (see
+ * parallel_vacuum_update_shared_delay_params()).
+ */
+void
+parallel_vacuum_propagate_shared_delay_params(void)
+{
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /*
+ * Quick return if the leader process is not sharing the delay parameters.
+ */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ /*
+ * Check if any delay parameters have changed. We can read them without
+ * locks as only the leader can modify them.
+ */
+ if (vacuum_cost_delay == pv_shared_cost_params->cost_delay &&
+ vacuum_cost_limit == pv_shared_cost_params->cost_limit &&
+ VacuumCostPageDirty == pv_shared_cost_params->cost_page_dirty &&
+ VacuumCostPageHit == pv_shared_cost_params->cost_page_hit &&
+ VacuumCostPageMiss == pv_shared_cost_params->cost_page_miss)
+ return;
+
+ /* Update the shared delay parameters */
+ SpinLockAcquire(&pv_shared_cost_params->mutex);
+ parallel_vacuum_set_cost_parameters(pv_shared_cost_params);
+ SpinLockRelease(&pv_shared_cost_params->mutex);
+
+ /*
+ * Increment the generation of the parameters, i.e. let parallel workers
+ * know that they should re-read shared cost params.
+ */
+ pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -555,12 +746,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -599,8 +795,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -1064,7 +1260,21 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
shared->dead_items_handle);
/* Set cost-based vacuum delay */
- VacuumUpdateCosts();
+ if (shared->is_autovacuum)
+ {
+ /*
+ * Parallel autovacuum workers initialize cost-based delay parameters
+ * from the leader's shared state rather than GUC defaults, because
+ * the leader may have applied per-table or autovacuum-specific
+ * overrides. pv_shared_cost_params must be set before calling
+ * parallel_vacuum_update_shared_delay_params().
+ */
+ pv_shared_cost_params = &(shared->cost_params);
+ parallel_vacuum_update_shared_delay_params();
+ }
+ else
+ VacuumUpdateCosts();
+
VacuumCostBalance = 0;
VacuumCostBalanceLocal = 0;
VacuumSharedCostBalance = &(shared->cost_balance);
@@ -1119,6 +1329,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
vac_close_indexes(nindexes, indrels, RowExclusiveLock);
table_close(rel, ShareUpdateExclusiveLock);
FreeAccessStrategy(pvs.bstrategy);
+
+ if (shared->is_autovacuum)
+ pv_shared_cost_params = NULL;
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8400e6722cc..592b37ed260 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1689,7 +1689,7 @@ VacuumUpdateCosts(void)
}
else
{
- /* Must be explicit VACUUM or ANALYZE */
+ /* Must be explicit VACUUM, ANALYZE, or a parallel autovacuum worker */
vacuum_cost_delay = VacuumCostDelay;
vacuum_cost_limit = VacuumCostLimit;
}
@@ -2931,8 +2931,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -2942,6 +2940,27 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.log_analyze_min_duration = log_analyze_min_duration;
tab->at_params.toast_parent = InvalidOid;
+ /* Determine the number of parallel vacuum workers to use */
+ tab->at_params.nworkers = 0;
+ if (avopts)
+ {
+ if (avopts->autovacuum_parallel_workers == 0)
+ {
+ /*
+ * Disable parallel vacuum if the reloption sets the parallel
+ * degree to zero.
+ */
+ tab->at_params.nworkers = -1;
+ }
+ else if (avopts->autovacuum_parallel_workers > 0)
+ tab->at_params.nworkers = avopts->autovacuum_parallel_workers;
+
+ /*
+ * autovacuum_parallel_workers == -1 falls through, keep
+ * nworkers=0
+ */
+ }
+
/*
* Later, in vacuum_rel(), we check reloptions for any
* vacuum_max_eager_freeze_failure_rate override.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e1546d9c97a..c4c3fbc4fe3 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3358,9 +3358,15 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception because changes to
+ * cost-based delay parameters need to reach its parallel workers. These
+ * parameters are propagated to its workers during parallel vacuum (see
+ * vacuumparallel.c for details). All other changes will affect only the
+ * parallel autovacuum leader.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a315c4ab8ab..fae3ec30a3c 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -170,6 +170,14 @@
max => '10.0',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel workers that can be used by a single autovacuum worker.',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_PARALLEL_WORKER_LIMIT',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 6d0337853e0..d902d629508 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 53bf1e21721..1d941c11997 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1432,6 +1432,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5b8023616c0..69fec07491b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -422,6 +422,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkerStats *wstats);
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_shared_delay_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 7277c37e779..2e10e3c814d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 236830f6b93..cd1e92f2302 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ int autovacuum_parallel_workers;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 864b407abcf..70fac2b31b9 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e5acacd5083..6b2b1391fa9 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..15e83010c1c
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for autovacuum"
+
+TAP_TESTS = 1
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..86e392bc0de
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_parallel_autovacuum.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
new file mode 100644
index 00000000000..c5a2e78246a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -0,0 +1,189 @@
+
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+# Test parallel autovacuum behavior
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Before each test we should disable autovacuum for 'test_autovac' table and
+# generate some dead tuples in it. Returns the current autovacuum_count of
+# the table test_autovac.
+sub prepare_for_next_test
+{
+ my ($node, $test_number) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+ UPDATE test_autovac SET col_1 = $test_number;
+ });
+
+ my $count = $node->safe_psql(
+ 'postgres', qq{
+ SELECT autovacuum_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+
+ return $count;
+}
+
+# Wait for the table to be vacuumed by an autovacuum worker.
+sub wait_for_autovacuum_complete
+{
+ my ($node, $old_count) = @_;
+
+ $node->poll_query_until(
+ 'postgres', qq{
+ SELECT autovacuum_count > $old_count FROM pg_stat_user_tables WHERE relname = 'test_autovac'
+ });
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+
+# Limit to one autovacuum worker and disable autovacuum logging globally
+# (enabled only on the test table) so that log checks below match only
+# activity on the expected table.
+$node->append_conf(
+ 'postgresql.conf', qq{
+autovacuum_max_workers = 1
+autovacuum_worker_slots = 1
+autovacuum_max_parallel_workers = 2
+max_worker_processes = 10
+max_parallel_workers = 10
+log_min_messages = debug2
+autovacuum_naptime = '1s'
+min_parallel_index_scan_size = 0
+log_autovacuum_min_duration = -1
+});
+$node->start;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create all functions needed for testing
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE EXTENSION injection_points;
+});
+
+my $indexes_num = 3;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table and fill it with some data
+$node->safe_psql(
+ 'postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ log_autovacuum_min_duration = 0);
+
+ INSERT INTO test_autovac
+ SELECT
+ g AS col1,
+ g + 1 AS col2,
+ g + 2 AS col3,
+ g + 3 AS col4
+ FROM generate_series(1, $initial_rows_num) AS g;
+});
+
+# Create specified number of b-tree indexes on the table
+$node->safe_psql(
+ 'postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can do it.
+
+my $av_count = prepare_for_next_test($node, 1);
+my $log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+wait_for_autovacuum_complete($node, $av_count);
+ok( $node->log_contains(
+ qr/parallel workers: index vacuum: 2 planned, 2 launched in total/,
+ $log_offset));
+
+# Test 2:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to the parallel workers.
+
+$av_count = prepare_for_next_test($node, 2);
+$log_offset = -s $node->logfile;
+
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_attach('autovacuum-start-parallel-vacuum', 'wait');
+
+ ALTER TABLE test_autovac SET (autovacuum_parallel_workers = 1, autovacuum_enabled = true);
+});
+
+# Wait until parallel autovacuum is initialized
+$node->wait_for_event('autovacuum worker',
+ 'autovacuum-start-parallel-vacuum');
+
+# Update the shared cost-based delay parameters.
+$node->safe_psql(
+ 'postgres', qq{
+ ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 500;
+ ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 5;
+ ALTER SYSTEM SET vacuum_cost_page_miss = 10;
+ ALTER SYSTEM SET vacuum_cost_page_dirty = 10;
+ ALTER SYSTEM SET vacuum_cost_page_hit = 10;
+ SELECT pg_reload_conf();
+});
+
+# Resume the leader process to update the shared parameters during heap scan (i.e.
+# vacuum_delay_point() is called) and launch a parallel vacuum worker, but it stops
+# before vacuuming indexes due to the injection point.
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_wakeup('autovacuum-start-parallel-vacuum');
+});
+
+# Check whether parallel worker successfully updated all parameters during
+# index processing.
+$node->wait_for_log(
+ qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=5, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
+ $log_offset);
+
+wait_for_autovacuum_complete($node, $av_count);
+
+# Cleanup
+$node->safe_psql(
+ 'postgres', qq{
+ SELECT injection_points_detach('autovacuum-start-parallel-vacuum');
+});
+
+$node->stop;
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c72f6c59573..a64faa32682 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2109,6 +2109,7 @@ PVIndStats
PVIndVacStatus
PVOID
PVShared
+PVSharedCostParams
PVWorkerUsage
PVWorkerStats
PX_Alias
--
2.53.0
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-04 08:37 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-04-04 08:37 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Sat, Apr 4, 2026 at 8:12 AM Masahiko Sawada <[email protected]> wrote:
>
> Regarding the regression tests, ISTM we no longer need
> 'autovacuum-leader-before-indexes-processing' injection point since it
> currently tests that parallel workers update their delay parameters
> during the initialization (i.e., in parallel_vacuum_main()). In order
> to verify the behavior of workers updating their delay parameters
> while processing indexes, we would need another injection point to
> stop parallel workers, which seems overkill to me. So I removed it but
> the test still covers the propagation logic.
>
> Regarding the patch, I don't think it's a good idea to include
> bgworker_internals.h from reloptions.c:
>
> I'd leave the maximum value as 1024.
OK, let's leave it.
>
> I've attached the patch; please check it. I think it's in good shape and
> I'm going to push it next Monday barring objections.
>
Thank you for updating the patch!
All changes look good to me.
BTW, what about the "opt-in vs. opt-out style" issue?
As I wrote here [1], we can consider a new approach: allow the user to set the
autovacuum_parallel_workers reloption even if the GUC parameter is zero.
I think it can satisfy all possible use cases.
[1] https://www.postgresql.org/message-id/CAJDiXggvE%3De%3D0%2BHnZ1XjwUcXYTb0dw77pRUts5gqY997YaxVjQ%40ma...
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-07 07:48 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
1 sibling, 0 replies; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-07 07:48 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Alexander Korotkov <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Wed, Apr 1, 2026 at 12:44 AM Daniil Davydov <[email protected]> wrote:
> >
> > > He confirmed that as a rule there are *hundreds of thousands* of tables in the
> > > system, the vast majority of which do not need to be vacuumed in parallel mode.
> >
> > I'm still struggling to see the technical justification; why would a
> > user want to avoid parallel vacuuming on eligible tables if they have
> > already explicitly allowed the system to use more resources by setting
> > autovacuum_max_parallel_workers to >0?
>
> Here I am talking about "introductory data". I.e. the situation that the user
> has before considering our parameter usage. Based on this situation, it seems
> to me that not everyone will want to turn on parallel a/v (because of resource
> shortage hazard).
>
> > If resource contention occurs,
> > it is typically a sign that the global parameters need re-tuning. As I
> > mentioned, the same contention can occur even with an opt-in style if
> > multiple tables are manually configured.
> >
>
> Yep, we already discussed it and I agree with you. I think that in the case of
> opt-in style the resource contention will be much more controlled. But actually
> the opt-in style in the form in which I originally proposed it, no longer seems
> like a good idea to me. Classic opt-in style will deprive us of support for
> half of the parallel a/v use cases. Anton's proposal seems to me like a good
> balance between the two styles.
>
> > > He also suggested the following: let the reloption override the value of the
> > > GUC parameter. I.e. even if the av_max_parallel_workers parameter is 0, the user
> > > can still set av_parallel_workers to 10 for some table, and autovacuum
> > > will process this table in parallel.
> > >
> > > I remember that you want to use the GUC parameter as a global switch, and this
> > > approach will break this logic. But according to Anton's words, it is okay if
> > > the GUC parameter cannot disable parallel a/v for all tables instantly. It will
> > > become an administrator's responsibility to manually turn off parallel a/v for
> > > several tables (again, it is completely OK). Thus, this feature can be handy
> > > for all use cases.
> >
> > While some autovacuum parameters do override GUCs, those are typically
> > local to the process (like cost delay). Parallel workers, however, are
> > a shared system-wide resource. In a multi-tenant environment, allowing
> > a single table's reloption to bypass the
> > autovacuum_max_parallel_workers = 0 limit could lead to unexpected
> > exhaustion of the worker pool.
>
> Will this exhaustion really be unexpected? If we describe such an ability in
> the documentation, and the user uses it, then everything is fair. Even if
> administrator forgets that he enabled av_parallel_workers reloption somewhere,
> then he can :
How can DBAs prevent the parallel worker pool from being exhausted if
users set a high value for the reloption?
> 1)
> Check the logfile (if log level is not too high) searching for logs like
> "parallel workers: index vacuum: N planned, N launched in total".
> 2)
> Run a query that selects all tables which have av_parallel_workers > 0.
Does that mean DBAs would need to run these queries periodically? I
don't think that in a multi-tenant environment, DBAs can (or should)
execute ALTER TABLE on user-owned tables just to fix resource issues.
>
> >I think that this GUC should act as a
> > reliable global switch for resource management.
>
> I agree that the "global switch" is an attractive idea and we should strive
> for it. But our parameter *can* play the role of the switch if users don't
> manually touch the av_parallel_workers reloption. But if they do - well, it is
> their responsibility to turn the reloption off.
>
> >
> > > I hope this doesn't look like adapting to the needs of one specific user.
> > > A lot of super-large productions are migrating to postgres now, and I believe
> > > that we should ensure their comfort too.
> >
> > I'm not prioritizing one specific use case over another. I believe
> > that there are also users who want to use parallel vacuum on hundreds
> > of thousands of tables. We should consider a better solution while
> > checking it from multiple perspectives such as the usability, the
> > robustness and consistency with the existing features and behaviors
> > etc.
>
> For those users who want to use parallel a/v for hundreds of thousands of
> tables we have the default value "-1" which allows parallel a/v everywhere via
> GUC parameter manipulation.
>
> For those users who want to parallel a/v on several specific tables we can
> allow setting reloption that will override the GUC.
>
> I guess that the question is : "Is it normal if the GUC parameter will lose
> ability to turn off parallel a/v everywhere after the user has manually raised
> the value for the av_parallel_workers reloption on a few tables?". If the
> answer is "Yes", I don't see any obstacles for us to allow overriding the GUC
> parameter via reloption.
I think the answer is no, particularly for this parameter. Since it
controls a system-wide shared resource, it should be capped by a GUC
to ensure centralized management, consistent with other
parallel-query-related GUCs and reloptions.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-07 07:49 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-07 07:49 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Sat, Apr 4, 2026 at 1:38 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Sat, Apr 4, 2026 at 8:12 AM Masahiko Sawada <[email protected]> wrote:
> >
> > Regarding the regression tests, ISTM we no longer need
> > 'autovacuum-leader-before-indexes-processing' injection point since it
> > currently tests that parallel workers update their delay parameters
> > during the initialization (i.e., in parallel_vacuum_main()). In order
> > to verify the behavior of workers updating their delay parameters
> > while processing indexes, we would need another injection point to
> > stop parallel workers, which seems overkill to me. So I removed it but
> > the test still covers the propagation logic.
> >
> > Regarding the patch, I don't think it's a good idea to include
> > bgworker_internals.h from reloptions.c:
> >
> > I'd leave the maximum value as 1024.
>
> OK, let's leave it.
>
> >
> > I've attached the patch; please check it. I think it's in good shape and
> > I'm going to push it next Monday barring objections.
> >
>
> Thank you for updating the patch!
> All changes look good to me.
Thank you! Pushed.
> BTW, what about the "opt-in vs. opt-out style" issue?
> As I wrote here [1], we can consider a new approach: allow the user to set the
> autovacuum_parallel_workers reloption even if the GUC parameter is zero.
> I think it can satisfy all possible use cases.
I've just replied to the email. Please check it[1].
Regards,
[1] https://www.postgresql.org/message-id/CAD21AoDEfe5-tYSqa%3DMGLP5TX5QH2irVZVyULCeTQj0J92Hp1A%40mail.g...
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-07 13:32 Daniil Davydov <[email protected]>
parent: Masahiko Sawada <[email protected]>
0 siblings, 1 reply; 112+ messages in thread
From: Daniil Davydov @ 2026-04-07 13:32 UTC (permalink / raw)
To: Masahiko Sawada <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
Hi,
On Tue, Apr 7, 2026 at 10:49 AM Masahiko Sawada <[email protected]> wrote:
>
> > >
> > > While some autovacuum parameters do override GUCs, those are typically
> > > local to the process (like cost delay). Parallel workers, however, are
> > > a shared system-wide resource. In a multi-tenant environment, allowing
> > > a single table's reloption to bypass the
> > > autovacuum_max_parallel_workers = 0 limit could lead to unexpected
> > > exhaustion of the worker pool.
> >
> > Will this exhaustion really be unexpected? If we describe such an ability in
> > the documentation, and the user uses it, then everything is fair. Even if
> > administrator forgets that he enabled av_parallel_workers reloption somewhere,
> > then he can :
>
> How can DBAs prevent the parallel worker pool from being exhausted if
> users set a high value for the reloption?
>
Only by manual control. Since the DBA increased the reloption manually, it is OK to ask
him to manually decrease it.
> > 1)
> > Check the logfile (if log level is not too high) searching for logs like
> > "parallel workers: index vacuum: N planned, N launched in total".
> > 2)
> > Run a query that selects all tables which have av_parallel_workers > 0.
>
> Does that mean DBAs would need to run these queries periodically?
Not really. I mean that even if the DBA has lost control of the parallel a/v
workers, he still has a way to find these bottlenecks.
> I don't think that in a multi-tenant environment, DBAs can (or should)
> execute ALTER TABLE on user-owned tables just to fix resource issues.
>
Well, the people I talked to had a different opinion, which is based on clients'
feedback about what is acceptable and what is not. I don't think that we can
convince each other, so let it be as it is :)
But if you don't mind continuing to discuss this topic (perhaps with the
involvement of other people), I would love to create a new thread for it.
> > I guess that the question is : "Is it normal if the GUC parameter will lose
> > ability to turn off parallel a/v everywhere after the user has manually raised
> > the value for the av_parallel_workers reloption on a few tables?". If the
> > answer is "Yes", I don't see any obstacles for us to allow overriding the GUC
> > parameter via reloption.
>
> I think the answer is no, particularly for this parameter. Since it
> controls a system-wide shared resource, it should be capped by a GUC
> to ensure centralized management, consistent with other
> parallel-query-related GUCs and reloptions.
OK. I believe that "global switch" will also be pretty handy for many use cases.
> Thank you! Pushed.
Great news! Thank you very much for your help, Masahiko-san!
--
Best regards,
Daniil Davydov
* Re: POC: Parallel processing of indexes in autovacuum
@ 2026-04-09 18:37 Masahiko Sawada <[email protected]>
parent: Daniil Davydov <[email protected]>
0 siblings, 0 replies; 112+ messages in thread
From: Masahiko Sawada @ 2026-04-09 18:37 UTC (permalink / raw)
To: Daniil Davydov <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Bharath Rupireddy <[email protected]>; Sami Imseih <[email protected]>; Matheus Alcantara <[email protected]>; Maxim Orlov <[email protected]>; Postgres hackers <[email protected]>
On Tue, Apr 7, 2026 at 6:32 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Tue, Apr 7, 2026 at 10:49 AM Masahiko Sawada <[email protected]> wrote:
> >
> > > >
> > > > While some autovacuum parameters do override GUCs, those are typically
> > > > local to the process (like cost delay). Parallel workers, however, are
> > > > a shared system-wide resource. In a multi-tenant environment, allowing
> > > > a single table's reloption to bypass the
> > > > autovacuum_max_parallel_workers = 0 limit could lead to unexpected
> > > > exhaustion of the worker pool.
> > >
> > > Will this exhaustion really be unexpected? If we describe such an ability in
> > > the documentation, and the user uses it, then everything is fair. Even if
> > > administrator forgets that he enabled av_parallel_workers reloption somewhere,
> > > then he can :
> >
> > How can DBAs prevent the parallel worker pool from being exhausted if
> > users set a high value for the reloption?
> >
>
> Only by manual control. Since the DBA increased the reloption manually, it is OK to ask
> him to manually decrease it.
In multi-tenant environments, the roles of table owners and DBAs are
often separated. Tenants can freely set reloptions via ALTER TABLE,
but a DBA cannot easily revert those settings on user-owned tables.
Even if a DBA tries to use ALTER TABLE to fix a misconfigured
reloption, it would cancel any currently running autovacuum on that
table. Furthermore, if the table is undergoing an anti-wraparound
vacuum, the ALTER TABLE command itself will be blocked, making it
impossible to resolve a resource crisis quickly. If a single tenant
could exhaust the entire parallel worker pool by setting a high
reloption value, the DBA would have no effective way to prevent or
mitigate it under an override model.
While I understand the use case for enabling parallel vacuum only on
specific tables, this is already achievable under the cap model (by
setting a global GUC > 0 and using the reloption to disable it on
others), even if the initial configuration is more tedious.
Also, I'm concerned that the override behavior would be inconsistent
with other parallel-query-related features. While some autovacuum
reloptions (like autovacuum_vacuum_scale_factor) do override GUCs,
those parameters only affect the local behavior of that specific
process and do not impact shared system-wide resources. In contrast,
autovacuum_max_parallel_workers takes workers from the
max_parallel_workers pool. Allowing a single table to monopolize this
shared pool by bypassing the GUC cap creates a significant risk that
cannot be easily managed.
While this isn't exclusively about multi-tenancy, I think that we
cannot simply introduce a behavior that creates such a high risk for
system-wide resource exhaustion.
>
> > > 1)
> > > Check the logfile (if log level is not too high) searching for logs like
> > > "parallel workers: index vacuum: N planned, N launched in total".
> > > 2)
> > > Run a query that selects all tables which have av_parallel_workers > 0.
> >
> > Does that mean DBAs would need to run these queries periodically?
>
> Not really. I mean that even if the DBA has lost control of the parallel a/v
> workers, he still has a way to find these bottlenecks.
>
> > I don't think that in a multi-tenant environment, DBAs can (or should)
> > execute ALTER TABLE on user-owned tables just to fix resource issues.
> >
>
> Well, the people I talked to had a different opinion, which is based on clients'
> feedback about what is acceptable and what is not. I don't think that we can
> convince each other, so let it be as it is :)
>
> But if you don't mind continuing to discuss this topic (perhaps with the
> involvement of other people), I would love to create a new thread for it.
Okay.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com