public inbox for [email protected]
Introduce XID age based replication slot invalidation (31+ messages, 6 participants)
* Introduce XID age based replication slot invalidation
@ 2025-09-18 17:20 John H <[email protected]>
  0 siblings, 2 replies; 31+ messages in thread
From: John H @ 2025-09-18 17:20 UTC
To: Bharath Rupireddy <[email protected]>; pgsql-hackers

Hi folks,

I'd like to restart the discussion about providing an XID-based slot
invalidation mechanism. The previous effort [1] proposed both XID-based
and time-based invalidation, and the inactive-time-based approach was
implemented first. The latest XID-based patch from Bharath Rupireddy can
be found here [2].

When thinking about availability of the database, inactive replication
slots cause two main pain points:

1) WAL accumulation
2) Replication slots with xmin/catalog_xmin can hold back vacuuming,
   leading to wraparound

The first issue can be mitigated by 'max_slot_wal_keep_size'. However
in the second case there are no good mechanisms to prioritize write
availability of the database and avoid wraparound. The new GUC
'idle_replication_slot_timeout' partially addresses the concern if you
have similar workloads. However it's hard to set the same setting
across a fleet of different applications.

It's easy to imagine a high-XID-churn workload in one cluster while
another has large batch jobs where changes get synced out periodically.
There isn't a one-size-fits-all setting for
'idle_replication_slot_timeout' in these two cases.

The attached patch addresses this by introducing 'max_slot_xid_age' in
a similar fashion. Replication slots whose xmin or catalog_xmin is older
than the configured age get invalidated, allowing vacuum to proceed and
biasing towards database availability.

Invalidation happens in CHECKPOINT, similar to
'idle_replication_slot_timeout', and when VACUUM occurs.

The patch currently attempts to invalidate once-per-autovacuum worker.
We're wondering if it should attempt invalidation on a per-relation
basis within the vacuum call itself.
That would account for scenarios where the cost_delay or naptime is high
between autovac executions.

Thanks,
John H

[1] https://www.postgresql.org/message-id/flat/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe%2Baw%40m...
[2] https://www.postgresql.org/message-id/flat/CALj2ACXe8%2BxSNdMXTMaSRWUwX7v61Ad4iddUwnn%3DdjSwx3GLLg%4...

--
John Hsu - Amazon Web Services

Attachments:

[application/octet-stream] 0044-Add-XID-age-based-replication-slot-invalidation.patch (23.2K, 2-0044-Add-XID-age-based-replication-slot-invalidation.patch)

From cd9cb104041800810d38e21e31d207311f112228 Mon Sep 17 00:00:00 2001
From: John Hsu <[email protected]>
Date: Fri, 8 Aug 2025 19:48:58 +0000
Subject: [PATCH] Add XID age based replication slot invalidation

This commit introduces the max_slot_xid_age GUC, which allows
replication slots whose xmin or catalog_xmin has reached the age
specified by this setting to be invalidated.

Idle or forgotten replication slots can hold back vacuum operations,
leading to bloat or transaction ID wraparound, which requires dropping
the slot and vacuuming in single-user mode. This setting avoids these
scenarios by proactively invalidating these stale slots on an XID basis.

Invalidation checks happen at the following points to prevent
wraparound:
- During CHECKPOINT
- During vacuum (including autovacuum)

This change makes it easy for administrators to protect against
wraparound concerns due to slots that are falling behind.

Author: Bharath Rupireddy Author: John Hsu --- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/transam/xlog.c | 4 +- src/backend/commands/vacuum.c | 70 ++++- src/backend/replication/slot.c | 80 +++++- src/backend/utils/misc/guc_parameters.dat | 8 + src/include/replication/slot.h | 7 +- .../t/049_invalidate_xid_aged_slots.pl | 240 ++++++++++++++++++ 8 files changed, 436 insertions(+), 7 deletions(-) create mode 100644 src/test/recovery/t/049_invalidate_xid_aged_slots.pl diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index e9b420f3ddb..eb07bcd3551 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4653,6 +4653,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. A value of zero (which is default) + disables this feature. Users can set this value anywhere from zero to + two billion. This parameter can only be set in the + <filename>postgresql.conf</filename> file or on the server command + line. + </para> + + <para> + This invalidation check happens either when the slot is acquired + for use or during vacuum or during checkpoint. 
+ </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 4187191ea74..beeb22a7da4 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3007,6 +3007,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. + </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 0baf0ac6160..41a48389afb 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7347,7 +7347,7 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE, _logSegNo, InvalidOid, InvalidTransactionId)) { @@ -7801,7 +7801,7 @@ CreateRestartPoint(int flags) replayPtr = GetXLogReplayRecPtr(&replayTLI); endptr = (receivePtr < replayPtr) ? 
replayPtr : receivePtr; KeepLogSeg(endptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE, _logSegNo, InvalidOid, InvalidTransactionId)) { diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 733ef40ae7c..91cc08069c8 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -47,6 +47,7 @@ #include "postmaster/autovacuum.h" #include "postmaster/bgworker_internals.h" #include "postmaster/interrupt.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/pmsignal.h" @@ -129,6 +130,7 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams params, static double compute_parallel_delay(void); static VacOptValue get_vacoptval_from_boolean(DefElem *def); static bool vac_tid_reaped(ItemPointer itemptr, void *state); +static void try_replication_slot_invalidation(void); /* * GUC check function to ensure GUC value specified is within the allowable @@ -471,6 +473,56 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) MemoryContextDelete(vac_context); } +/* + * Try invalidating replication slots based on current replication slot xmin + * limits once every vacuum cycle. + */ +static void +try_replication_slot_invalidation(void) +{ + + TransactionId min_slot_xmin; + TransactionId min_slot_catalog_xmin; + bool can_invalidate = false; + TransactionId cutoff; + TransactionId curr; + + if (max_slot_xid_age == 0) + return; + + curr = ReadNextTransactionId(); + + /* + * Calculate oldest XID a slot's xmin or catalog_xmin can have before + * they are invalidated. 
+ */ + cutoff = curr - max_slot_xid_age; + + if (!TransactionIdIsNormal(cutoff)) + cutoff = FirstNormalTransactionId; + + ProcArrayGetReplicationSlotXmin(&min_slot_xmin, &min_slot_catalog_xmin); + + if (TransactionIdIsNormal(min_slot_xmin) && + TransactionIdPrecedesOrEquals(min_slot_xmin, cutoff)) + can_invalidate = true; + else if (TransactionIdIsNormal(min_slot_catalog_xmin) && + TransactionIdPrecedesOrEquals(min_slot_catalog_xmin, cutoff)) + can_invalidate = true; + + if (can_invalidate) + { + /* + * Note that InvalidateObsoleteReplicationSlots is also called as part + * of CHECKPOINT, and emitting ERRORs from within is avoided already. + * Therefore, there is no concern here that any ERROR from + * invalidating replication slots blocks VACUUM. + */ + InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0, + InvalidOid, InvalidTransactionId); + } +} + /* * Internal entry point for autovacuum and the VACUUM / ANALYZE commands. * @@ -498,7 +550,7 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg MemoryContext vac_context, bool isTopLevel) { static bool in_vacuum = false; - + static bool first_time = true; const char *stmttype; volatile bool in_outer_xact, use_own_xacts; @@ -611,6 +663,22 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg CommitTransactionCommand(); } + if (params.options & VACOPT_VACUUM) + { + if (first_time) + try_replication_slot_invalidation(); + + /* + * Every autovacuum worker will attempt to invalidate replication slots once. + * If a replication slot exceeds the age specified by max_slot_xid_age then + * it will only be invalidated once the next worker attempt kicks off. For + * manual VACUUM always attempt invalidation to account for long-lived + * maintenance connections. 
+ */ + if (AmAutoVacuumWorkerProcess()) + first_time = false; + } + /* Turn vacuum cost accounting on or off, and set/clear in_vacuum */ PG_TRY(); { diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index fd0fdb96d42..b4fc0ba3702 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -116,6 +116,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -157,6 +158,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin greater + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. @@ -1620,7 +1627,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin) { StringInfoData err_detail; StringInfoData err_hint; @@ -1665,6 +1674,26 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + appendStringInfo(&err_detail, _("The slot's xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."), + xmin, + max_slot_xid_age); + else + appendStringInfo(&err_detail, _("The slot's catalog_xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."), + catalog_xmin, + max_slot_xid_age); + + appendStringInfo(&err_hint, _("You might need to increase 
\"%s\"."), + "max_slot_xid_age"); + + + break; + } case RS_INVAL_NONE: pg_unreachable(); } @@ -1783,6 +1812,13 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + if (possible_causes & RS_INVAL_XID_AGE) + { + /* Safe since we hold the replication slot's spinlock needed to avoid race conditions */ + if (ReplicationSlotIsXIDAged(s->data.xmin, s->data.catalog_xmin)) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1972,7 +2008,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, s->data.catalog_xmin); if (MyBackendType == B_STARTUP) (void) SendProcSignal(active_pid, @@ -2019,7 +2055,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, s->data.catalog_xmin); /* done with this slot for now */ break; @@ -2044,6 +2080,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -3093,3 +3131,39 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Check true if the given passed in xmin or catalog_xmin age is + * older than the age specified by max_slot_xid_age. 
+ */ +bool +ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId catalog_xmin) +{ + TransactionId cutoff; + TransactionId curr; + bool is_aged = false; + + if (max_slot_xid_age == 0) + return false; + + curr = ReadNextTransactionId(); + + /* + * Calculate oldest XID a slot's xmin or catalog_xmin can have before + * they are invalidated. + */ + cutoff = curr - max_slot_xid_age; + + if (!TransactionIdIsNormal(cutoff)) + cutoff = FirstNormalTransactionId; + + if (TransactionIdIsNormal(xmin) && + TransactionIdPrecedesOrEquals(xmin, cutoff)) + is_aged = true; + + if (TransactionIdIsNormal(catalog_xmin) && + TransactionIdPrecedesOrEquals(catalog_xmin, cutoff)) + is_aged = true; + + return is_aged; +} diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 6bc6be13d2a..c162268c314 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -1717,6 +1717,14 @@ max => 'INT_MAX', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # we have no microseconds designation, so can't supply units here { name => 'commit_delay', type => 'int', context => 'PGC_SUSET', group => 'WAL_SETTINGS', short_desc => 'Sets the delay in microseconds between transaction commit and flushing WAL to disk.', diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index fe62162cde3..3253f108ffc 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } 
ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * On-Disk data of a replication slot, preserved across restarts. @@ -293,6 +295,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -350,4 +353,6 @@ extern bool SlotExistsInSyncStandbySlots(const char *slot_name); extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel); extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn); +extern bool ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId catalog_xmin); + #endif /* SLOT_H */ diff --git a/src/test/recovery/t/049_invalidate_xid_aged_slots.pl b/src/test/recovery/t/049_invalidate_xid_aged_slots.pl new file mode 100644 index 00000000000..f1e8f003f95 --- /dev/null +++ b/src/test/recovery/t/049_invalidate_xid_aged_slots.pl @@ -0,0 +1,240 @@ +# Copyright (c) 2025, PostgreSQL Global Development Group + +# Test for replication slots invalidation due to XID age +use strict; +use warnings FATAL => 'all'; + +use PostgreSQL::Test::BackgroundPsql; +use PostgreSQL::Test::Utils; +use PostgreSQL::Test::Cluster; +use Test::More; + +# Wait for slot to first become inactive and then get invalidated +sub wait_for_slot_invalidation +{ + my ($node, $slot_name, $reason) = @_; + my $name = $node->name; + + # Wait for the inactive replication slot to be invalidated + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + invalidation_reason = '$reason'; + ]) + or die + "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name"; +} + +# Do some work for advancing xids on a 
given node +sub advance_xids +{ + my ($node, $table_name) = @_; + + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 10000..11000 loop + -- use an exception block so that each iteration eats an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); +} + +# ============================================================================= +# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. + +# Initialize primary node +my $primary = PostgreSQL::Test::Cluster->new('primary'); +$primary->init(allows_streaming => 'logical'); + +# Configure primary with XID age settings +$primary->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = 500 +}); + +$primary->start; + +# Take a backup for creating standby +my $backup_name = 'backup'; +$primary->backup($backup_name); + +# Create a standby linking to the primary using the replication slot +my $standby = PostgreSQL::Test::Cluster->new('standby'); +$standby->init_from_backup($primary, $backup_name, has_streaming => 1); + +# Enable hs_feedback. The slot should gain an xmin. We set the status interval +# so we'll see the results promptly. 
+$standby->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +max_standby_streaming_delay = 3600000 +}); + +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot', immediately_reserve := true); +]); + +$standby->start; + +# Create some content on primary to move xmin +$primary->safe_psql('postgres', + "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a"); + +# Wait until standby has replayed enough data +$primary->wait_for_catchup($standby); + +$primary->poll_query_until( + 'postgres', qq[ + SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL) + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb_slot'; +]) or die "Timed out waiting for slot sb_slot xmin to advance"; + +# Stop standby to make the replication slot's xmin on primary to age + +# Read on standby that causes xmin to be held on slot +my $standby_session = $standby->interactive_psql('postgres'); +$standby_session->query("BEGIN; SET default_transaction_isolation = 'repeatable read'; SELECT * FROM tab_int;"); + +#$standby->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'tab_int'); + +# Wait for the replication slot to become inactive and then invalidated due to +# XID age. +$primary->safe_psql('postgres', "CHECKPOINT"); +wait_for_slot_invalidation($primary, 'sb_slot', 'xid_aged'); + +$standby_session->quit; +$standby->stop; + +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. +# ============================================================================= + +# ============================================================================= +# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+ +# Create a subscriber node +my $subscriber = PostgreSQL::Test::Cluster->new('subscriber'); +$subscriber->init(allows_streaming => 'logical'); +$subscriber->start; + +# Create tables on both primary and subscriber +$primary->safe_psql('postgres', "CREATE TABLE test_tbl (id int)"); +$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)"); + +# Insert some initial data +$primary->safe_psql('postgres', + "INSERT INTO test_tbl VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $publisher_connstr = $primary->connstr . ' dbname=postgres'; +$primary->safe_psql('postgres', + "CREATE PUBLICATION pub FOR TABLE test_tbl"); + +$subscriber->safe_psql('postgres', + "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub_slot')" +); + +# Wait for initial sync to complete +$subscriber->wait_for_subscription_sync($primary, 'sub'); + +my $result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl"); +is($result, qq(5), "check initial copy was done for logical replication"); + +# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin) +$primary->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub_slot'; +]) or die "Timed out waiting for slot lsub_slot catalog_xmin to advance"; + +# Stop subscriber to make the replication slot on primary inactive +$subscriber->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'test_tbl'); + +# Wait for the replication slot to become inactive and then invalidated due to +# XID age. +$primary->safe_psql('postgres', "CHECKPOINT"); +wait_for_slot_invalidation($primary, 'lsub_slot', 'xid_aged'); + +# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+# ============================================================================= + +# ============================================================================= +# Testcase start: Test VACUUM command triggering slot invalidation +# + +# Create another physical replication slot for VACUUM test +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'vacuum_test_slot', immediately_reserve := true); +]); + +# Create a new standby for this test +my $standby_vacuum = PostgreSQL::Test::Cluster->new('standby_vacuum'); +$standby_vacuum->init_from_backup($primary, $backup_name, has_streaming => 1); + +$standby_vacuum->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'vacuum_test_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$standby_vacuum->start; + +# Wait until standby has replayed enough data and slot gets xmin +$primary->wait_for_catchup($standby_vacuum); + +$primary->poll_query_until( + 'postgres', qq[ + SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL) + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'vacuum_test_slot'; +]) or die "Timed out waiting for slot vacuum_test_slot xmin to advance"; + +# Stop standby to make the replication slot's xmin on primary to age +$standby_vacuum->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'tab_int'); + +# Use VACUUM to trigger slot invalidation (instead of CHECKPOINT) +# This tests that VACUUM command can trigger XID age invalidation +$primary->safe_psql('postgres', "VACUUM"); + +# Wait for the replication slot to become invalidated due to XID age triggered by VACUUM +$primary->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = 'vacuum_test_slot' AND + invalidation_reason = 'xid_aged'; +]) + or die "Timed out while waiting for slot vacuum_test_slot to be invalidated by VACUUM"; + +# Testcase end: Test VACUUM command triggering slot invalidation 
+# =============================================================================
+
+ok(1, "all XID age invalidation tests completed successfully");
+
+done_testing();
--
2.50.1

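For readers following the cutoff arithmetic in the ReplicationSlotIsXIDAged() hunk above, here is a minimal stand-alone sketch in C. It is not code from the patch: the names slot_is_xid_aged() and xid_precedes_or_equals() are hypothetical, the next XID is passed in as a parameter instead of being read with ReadNextTransactionId(), and the comparison helper only assumes the usual modulo-2^32 rule for normal XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-alone stand-ins for the PostgreSQL definitions (assumed values). */
typedef uint32_t TransactionId;

#define InvalidTransactionId        ((TransactionId) 0)
#define FirstNormalTransactionId    ((TransactionId) 3)
#define TransactionIdIsNormal(xid)  ((xid) >= FirstNormalTransactionId)

/*
 * Hypothetical helper: modulo-2^32 "a <= b" for normal XIDs, so the
 * comparison keeps working after the XID counter wraps around.
 */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/*
 * Hypothetical sketch of the age test: returns true if either xmin or
 * catalog_xmin is at or behind (next_xid - max_age).  max_age == 0
 * disables the check, as with the proposed GUC.
 */
static bool
slot_is_xid_aged(TransactionId next_xid, int max_age,
                 TransactionId xmin, TransactionId catalog_xmin)
{
    TransactionId cutoff;

    assert(max_age >= 0);

    if (max_age == 0)
        return false;

    /* Oldest XID a slot may hold before it counts as aged. */
    cutoff = next_xid - (TransactionId) max_age;

    /* The subtraction can land on a special XID; clamp it to a normal one. */
    if (!TransactionIdIsNormal(cutoff))
        cutoff = FirstNormalTransactionId;

    if (TransactionIdIsNormal(xmin) &&
        xid_precedes_or_equals(xmin, cutoff))
        return true;

    if (TransactionIdIsNormal(catalog_xmin) &&
        xid_precedes_or_equals(catalog_xmin, cutoff))
        return true;

    return false;
}
```

With a next XID of 1000 and a maximum age of 500, the cutoff is 500, so a slot with xmin 400 counts as aged while one with xmin 600 does not. An invalid xmin or catalog_xmin is simply skipped.
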
* RE: Introduce XID age based replication slot invalidation
@ 2025-09-19 08:07 Hayato Kuroda (Fujitsu) <[email protected]>
  parent: John H <[email protected]>
  1 sibling, 2 replies; 31+ messages in thread
From: Hayato Kuroda (Fujitsu) @ 2025-09-19 08:07 UTC
To: 'John H' <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; pgsql-hackers

Dear John,

> The first issue can be mitigated by 'max_slot_wal_keep_size'. However
> in the second case there are no good mechanisms to prioritize write
> availability of the database and avoid wraparound. The new GUC
> 'idle_replication_slot_timeout' partially addresses the concern if you
> have similar workloads. However it's hard to set the same setting
> across a fleet of different applications.

IIUC, this feature addresses the wraparound issue more directly than the
other invalidation mechanisms. The motivation seems sufficient to me.

> The patch currently attempts to invalidate once-per-autovacuum worker.
> We're wondering if it should attempt invalidation on a per-relation
> basis within the vacuum call itself. That would account for scenarios
> where the cost_delay or naptime is high between autovac executions.

I have a concern that the age calculation acquires XidGenLock, so
performance could be affected. Do you have any insights on this?

> Invalidation happens in CHECKPOINT, similar to
> 'idle_replication_slot_timeout', and when VACUUM occurs.

Let me confirm, because I'm new to this. VACUUM can also trigger it
because an old XID makes VACUUM fail, right? The timeout is aimed at
WAL, so it is not closely related to VACUUM, which does not recycle
segments. In contrast, would it be possible to do the XID-age check only
at VACUUM?

Regarding the patch, try_replication_slot_invalidation() and
ReplicationSlotIsXIDAged() do the same task. Can we reduce the
duplicated part?

Best regards,
Hayato Kuroda
FUJITSU LIMITED

* Re: Introduce XID age based replication slot invalidation
@ 2025-09-19 23:42 John H <[email protected]>
  parent: Hayato Kuroda (Fujitsu) <[email protected]>
  1 sibling, 0 replies; 31+ messages in thread
From: John H @ 2025-09-19 23:42 UTC
To: Hayato Kuroda (Fujitsu) <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; pgsql-hackers

Hi Hayato,

Thank you for taking a look.

> > The patch currently attempts to invalidate once-per-autovacuum worker.
> > We're wondering if it should attempt invalidation on a per-relation
> > basis within the vacuum call itself. That would account for scenarios
> > where the cost_delay or naptime is high between autovac executions.
>
> I have a concern that the age calculation acquires XidGenLock, so
> performance could be affected. Do you have any insights on this?

Are you concerned about the case where we do the check per table, or
about the current behavior, where it happens only once per worker?

> > Invalidation happens in CHECKPOINT, similar to
> > 'idle_replication_slot_timeout', and when VACUUM occurs.
>
> Let me confirm, because I'm new to this. VACUUM can also trigger it
> because an old XID makes VACUUM fail, right? The timeout is aimed at
> WAL, so it is not closely related to VACUUM, which does not recycle
> segments.

I feel that the timeout is used as a way to roughly address storage
accumulation or VACUUM not progressing due to slots.

> In contrast, would it be possible to do the XID-age check only at
> VACUUM?

It's also done in CHECKPOINT because there can be stale replication
slots on a standby that don't exist on the writer. We would still want
them to be invalidated.

> Regarding the patch, try_replication_slot_invalidation() and
> ReplicationSlotIsXIDAged() do the same task. Can we reduce the
> duplicated part?

Thanks for catching that; I thought I had done this, but apparently not.
Updated in the latest attachment.
--
John Hsu - Amazon Web Services

Attachments:

[application/octet-stream] 0045-Add-XID-age-based-replication-slot-invalidation.patch (22.6K, 2-0045-Add-XID-age-based-replication-slot-invalidation.patch)

From b9c965a6459bfced99f59a60fef2897abe282159 Mon Sep 17 00:00:00 2001
From: John Hsu <[email protected]>
Date: Fri, 8 Aug 2025 19:48:58 +0000
Subject: [PATCH] Add XID age based replication slot invalidation

This commit introduces the max_slot_xid_age GUC, which allows
replication slots whose xmin or catalog_xmin has reached the age
specified by this setting to be invalidated.

Idle or forgotten replication slots can hold back vacuum operations,
leading to bloat or transaction ID wraparound, which requires dropping
the slot and vacuuming in single-user mode. This setting avoids these
scenarios by proactively invalidating these stale slots on an XID basis.

Invalidation checks happen at the following points to prevent
wraparound:
- During CHECKPOINT
- During vacuum (including autovacuum)

This change makes it easy for administrators to protect against
wraparound concerns due to slots that are falling behind.

Author: Bharath Rupireddy Author: John Hsu --- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/transam/xlog.c | 4 +- src/backend/commands/vacuum.c | 45 +++- src/backend/replication/slot.c | 80 +++++- src/backend/utils/misc/guc_parameters.dat | 8 + src/include/replication/slot.h | 7 +- .../t/049_invalidate_xid_aged_slots.pl | 240 ++++++++++++++++++ 8 files changed, 411 insertions(+), 7 deletions(-) create mode 100644 src/test/recovery/t/049_invalidate_xid_aged_slots.pl diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index e9b420f3ddb..eb07bcd3551 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4653,6 +4653,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. A value of zero (which is default) + disables this feature. Users can set this value anywhere from zero to + two billion. This parameter can only be set in the + <filename>postgresql.conf</filename> file or on the server command + line. + </para> + + <para> + This invalidation check happens either when the slot is acquired + for use or during vacuum or during checkpoint. 
+ </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 4187191ea74..beeb22a7da4 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3007,6 +3007,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. + </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 0baf0ac6160..41a48389afb 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7347,7 +7347,7 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE, _logSegNo, InvalidOid, InvalidTransactionId)) { @@ -7801,7 +7801,7 @@ CreateRestartPoint(int flags) replayPtr = GetXLogReplayRecPtr(&replayTLI); endptr = (receivePtr < replayPtr) ? 
replayPtr : receivePtr; KeepLogSeg(endptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE, _logSegNo, InvalidOid, InvalidTransactionId)) { diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 733ef40ae7c..28be9225501 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -47,6 +47,7 @@ #include "postmaster/autovacuum.h" #include "postmaster/bgworker_internals.h" #include "postmaster/interrupt.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/pmsignal.h" @@ -129,6 +130,7 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams params, static double compute_parallel_delay(void); static VacOptValue get_vacoptval_from_boolean(DefElem *def); static bool vac_tid_reaped(ItemPointer itemptr, void *state); +static void try_replication_slot_invalidation(void); /* * GUC check function to ensure GUC value specified is within the allowable @@ -471,6 +473,31 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) MemoryContextDelete(vac_context); } +/* + * Try invalidating replication slots based on current replication slot xmin + * limits once every vacuum cycle. + */ +static void +try_replication_slot_invalidation(void) +{ + TransactionId min_slot_xmin = InvalidTransactionId; + TransactionId min_slot_catalog_xmin = InvalidTransactionId; + + ProcArrayGetReplicationSlotXmin(&min_slot_xmin, &min_slot_catalog_xmin); + + if (ReplicationSlotIsXIDAged(min_slot_xmin, min_slot_catalog_xmin)) + { + /* + * Note that InvalidateObsoleteReplicationSlots is also called as part + * of CHECKPOINT, and emitting ERRORs from within is avoided already. + * Therefore, there is no concern here that any ERROR from + * invalidating replication slots blocks VACUUM. 
+ */ + InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0, + InvalidOid, InvalidTransactionId); + } +} + /* * Internal entry point for autovacuum and the VACUUM / ANALYZE commands. * @@ -498,7 +525,7 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg MemoryContext vac_context, bool isTopLevel) { static bool in_vacuum = false; - + static bool first_time = true; const char *stmttype; volatile bool in_outer_xact, use_own_xacts; @@ -611,6 +638,22 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg CommitTransactionCommand(); } + if (params.options & VACOPT_VACUUM) + { + if (first_time) + try_replication_slot_invalidation(); + + /* + * Every autovacuum worker will attempt to invalidate replication slots once. + * If a replication slot exceeds the age specified by max_slot_xid_age then + * it will only be invalidated once the next worker attempt kicks off. For + * manual VACUUM always attempt invalidation to account for long-lived + * maintenance connections. + */ + if (AmAutoVacuumWorkerProcess()) + first_time = false; + } + /* Turn vacuum cost accounting on or off, and set/clear in_vacuum */ PG_TRY(); { diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index fd0fdb96d42..b4fc0ba3702 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -116,6 +116,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -157,6 +158,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin greater + * than the specified age; '0' disables it. 
+ */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. @@ -1620,7 +1627,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin) { StringInfoData err_detail; StringInfoData err_hint; @@ -1665,6 +1674,26 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + appendStringInfo(&err_detail, _("The slot's xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."), + xmin, + max_slot_xid_age); + else + appendStringInfo(&err_detail, _("The slot's catalog_xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."), + catalog_xmin, + max_slot_xid_age); + + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + + + break; + } case RS_INVAL_NONE: pg_unreachable(); } @@ -1783,6 +1812,13 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + if (possible_causes & RS_INVAL_XID_AGE) + { + /* Safe since we hold the replication slot's spinlock needed to avoid race conditions */ + if (ReplicationSlotIsXIDAged(s->data.xmin, s->data.catalog_xmin)) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1972,7 +2008,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, s->data.catalog_xmin); if (MyBackendType == B_STARTUP) (void) SendProcSignal(active_pid, @@ -2019,7 +2055,7 @@ InvalidatePossiblyObsoleteSlot(uint32 
possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, s->data.catalog_xmin); /* done with this slot for now */ break; @@ -2044,6 +2080,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -3093,3 +3131,39 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Check true if the given passed in xmin or catalog_xmin age is + * older than the age specified by max_slot_xid_age. + */ +bool +ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId catalog_xmin) +{ + TransactionId cutoff; + TransactionId curr; + bool is_aged = false; + + if (max_slot_xid_age == 0) + return false; + + curr = ReadNextTransactionId(); + + /* + * Calculate oldest XID a slot's xmin or catalog_xmin can have before + * they are invalidated. 
+ */ + cutoff = curr - max_slot_xid_age; + + if (!TransactionIdIsNormal(cutoff)) + cutoff = FirstNormalTransactionId; + + if (TransactionIdIsNormal(xmin) && + TransactionIdPrecedesOrEquals(xmin, cutoff)) + is_aged = true; + + if (TransactionIdIsNormal(catalog_xmin) && + TransactionIdPrecedesOrEquals(catalog_xmin, cutoff)) + is_aged = true; + + return is_aged; +} diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 6bc6be13d2a..c162268c314 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -1717,6 +1717,14 @@ max => 'INT_MAX', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # we have no microseconds designation, so can't supply units here { name => 'commit_delay', type => 'int', context => 'PGC_SUSET', group => 'WAL_SETTINGS', short_desc => 'Sets the delay in microseconds between transaction commit and flushing WAL to disk.', diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index fe62162cde3..3253f108ffc 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * On-Disk data of a replication slot, preserved across restarts. 
@@ -293,6 +295,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -350,4 +353,6 @@ extern bool SlotExistsInSyncStandbySlots(const char *slot_name); extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel); extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn); +extern bool ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId catalog_xmin); + #endif /* SLOT_H */ diff --git a/src/test/recovery/t/049_invalidate_xid_aged_slots.pl b/src/test/recovery/t/049_invalidate_xid_aged_slots.pl new file mode 100644 index 00000000000..f1e8f003f95 --- /dev/null +++ b/src/test/recovery/t/049_invalidate_xid_aged_slots.pl @@ -0,0 +1,240 @@ +# Copyright (c) 2025, PostgreSQL Global Development Group + +# Test for replication slots invalidation due to XID age +use strict; +use warnings FATAL => 'all'; + +use PostgreSQL::Test::BackgroundPsql; +use PostgreSQL::Test::Utils; +use PostgreSQL::Test::Cluster; +use Test::More; + +# Wait for slot to first become inactive and then get invalidated +sub wait_for_slot_invalidation +{ + my ($node, $slot_name, $reason) = @_; + my $name = $node->name; + + # Wait for the inactive replication slot to be invalidated + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + invalidation_reason = '$reason'; + ]) + or die + "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name"; +} + +# Do some work for advancing xids on a given node +sub advance_xids +{ + my ($node, $table_name) = @_; + + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 10000..11000 loop + -- use an exception block so that each iteration eats 
an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); +} + +# ============================================================================= +# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. + +# Initialize primary node +my $primary = PostgreSQL::Test::Cluster->new('primary'); +$primary->init(allows_streaming => 'logical'); + +# Configure primary with XID age settings +$primary->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = 500 +}); + +$primary->start; + +# Take a backup for creating standby +my $backup_name = 'backup'; +$primary->backup($backup_name); + +# Create a standby linking to the primary using the replication slot +my $standby = PostgreSQL::Test::Cluster->new('standby'); +$standby->init_from_backup($primary, $backup_name, has_streaming => 1); + +# Enable hs_feedback. The slot should gain an xmin. We set the status interval +# so we'll see the results promptly. 
+$standby->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +max_standby_streaming_delay = 3600000 +}); + +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot', immediately_reserve := true); +]); + +$standby->start; + +# Create some content on primary to move xmin +$primary->safe_psql('postgres', + "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a"); + +# Wait until standby has replayed enough data +$primary->wait_for_catchup($standby); + +$primary->poll_query_until( + 'postgres', qq[ + SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL) + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb_slot'; +]) or die "Timed out waiting for slot sb_slot xmin to advance"; + +# Stop standby to make the replication slot's xmin on primary to age + +# Read on standby that causes xmin to be held on slot +my $standby_session = $standby->interactive_psql('postgres'); +$standby_session->query("BEGIN; SET default_transaction_isolation = 'repeatable read'; SELECT * FROM tab_int;"); + +#$standby->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'tab_int'); + +# Wait for the replication slot to become inactive and then invalidated due to +# XID age. +$primary->safe_psql('postgres', "CHECKPOINT"); +wait_for_slot_invalidation($primary, 'sb_slot', 'xid_aged'); + +$standby_session->quit; +$standby->stop; + +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. +# ============================================================================= + +# ============================================================================= +# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+ +# Create a subscriber node +my $subscriber = PostgreSQL::Test::Cluster->new('subscriber'); +$subscriber->init(allows_streaming => 'logical'); +$subscriber->start; + +# Create tables on both primary and subscriber +$primary->safe_psql('postgres', "CREATE TABLE test_tbl (id int)"); +$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)"); + +# Insert some initial data +$primary->safe_psql('postgres', + "INSERT INTO test_tbl VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $publisher_connstr = $primary->connstr . ' dbname=postgres'; +$primary->safe_psql('postgres', + "CREATE PUBLICATION pub FOR TABLE test_tbl"); + +$subscriber->safe_psql('postgres', + "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub_slot')" +); + +# Wait for initial sync to complete +$subscriber->wait_for_subscription_sync($primary, 'sub'); + +my $result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl"); +is($result, qq(5), "check initial copy was done for logical replication"); + +# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin) +$primary->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub_slot'; +]) or die "Timed out waiting for slot lsub_slot catalog_xmin to advance"; + +# Stop subscriber to make the replication slot on primary inactive +$subscriber->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'test_tbl'); + +# Wait for the replication slot to become inactive and then invalidated due to +# XID age. +$primary->safe_psql('postgres', "CHECKPOINT"); +wait_for_slot_invalidation($primary, 'lsub_slot', 'xid_aged'); + +# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+# ============================================================================= + +# ============================================================================= +# Testcase start: Test VACUUM command triggering slot invalidation +# + +# Create another physical replication slot for VACUUM test +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'vacuum_test_slot', immediately_reserve := true); +]); + +# Create a new standby for this test +my $standby_vacuum = PostgreSQL::Test::Cluster->new('standby_vacuum'); +$standby_vacuum->init_from_backup($primary, $backup_name, has_streaming => 1); + +$standby_vacuum->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'vacuum_test_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$standby_vacuum->start; + +# Wait until standby has replayed enough data and slot gets xmin +$primary->wait_for_catchup($standby_vacuum); + +$primary->poll_query_until( + 'postgres', qq[ + SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL) + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'vacuum_test_slot'; +]) or die "Timed out waiting for slot vacuum_test_slot xmin to advance"; + +# Stop standby to make the replication slot's xmin on primary to age +$standby_vacuum->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'tab_int'); + +# Use VACUUM to trigger slot invalidation (instead of CHECKPOINT) +# This tests that VACUUM command can trigger XID age invalidation +$primary->safe_psql('postgres', "VACUUM"); + +# Wait for the replication slot to become invalidated due to XID age triggered by VACUUM +$primary->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = 'vacuum_test_slot' AND + invalidation_reason = 'xid_aged'; +]) + or die "Timed out while waiting for slot vacuum_test_slot to be invalidated by VACUUM"; + +# Testcase end: Test VACUUM command triggering slot invalidation 
+# ============================================================================= + +ok(1, "all XID age invalidation tests completed successfully"); + +done_testing(); -- 2.50.1
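For readers following the age check in ReplicationSlotIsXIDAged from the patch above, here is a minimal standalone sketch of the same cutoff arithmetic. It is not the patch's code: the function names slot_is_xid_aged and xid_precedes_or_equals are hypothetical, the next XID is passed in as a parameter instead of being read via ReadNextTransactionId(), and the comparison helper only imitates the modulo-2^32 behaviour that PostgreSQL's TransactionIdPrecedesOrEquals is understood to have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* XIDs 0, 1 and 2 are reserved; 3 is the first normal XID. */
#define FirstNormalTransactionId ((TransactionId) 3)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/*
 * Hypothetical stand-in for TransactionIdPrecedesOrEquals: normal XIDs are
 * compared modulo 2^32, special XIDs are compared as plain integers.
 */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    if (!TransactionIdIsNormal(a) || !TransactionIdIsNormal(b))
        return a <= b;
    return (int32_t) (a - b) <= 0;
}

/*
 * Hypothetical sketch of the age check. next_xid is an explicit argument
 * here (an assumption made for testability); max_age plays the role of
 * max_slot_xid_age, with 0 meaning "disabled".
 */
static bool
slot_is_xid_aged(TransactionId next_xid, TransactionId xmin,
                 TransactionId catalog_xmin, int max_age)
{
    TransactionId cutoff;

    assert(max_age >= 0);

    if (max_age == 0)
        return false;

    /* Oldest XID a slot may hold; unsigned subtraction wraps on purpose. */
    cutoff = next_xid - (TransactionId) max_age;
    if (!TransactionIdIsNormal(cutoff))
        cutoff = FirstNormalTransactionId;

    if (TransactionIdIsNormal(xmin) &&
        xid_precedes_or_equals(xmin, cutoff))
        return true;

    if (TransactionIdIsNormal(catalog_xmin) &&
        xid_precedes_or_equals(catalog_xmin, cutoff))
        return true;

    return false;
}
```

With next_xid = 1000 and max_age = 500 the cutoff is 500, so a slot xmin of 400 or 500 counts as aged and 501 does not. Because the subtraction and the comparison are both modulo 2^32, the same holds shortly after the XID counter wraps: with next_xid = 10, an xmin of 4294967000 is only 306 XIDs old and is not flagged.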
* Re: Introduce XID age based replication slot invalidation @ 2025-09-25 00:18 Bharath Rupireddy <[email protected]> parent: John H <[email protected]> 1 sibling, 0 replies; 31+ messages in thread From: Bharath Rupireddy @ 2025-09-25 00:18 UTC (permalink / raw) To: John H <[email protected]>; +Cc: pgsql-hackers Hi, On Thu, Sep 18, 2025 at 10:20 AM John H <[email protected]> wrote: > > I'd like to restart the discussion about providing an xid-based slot > invalidation mechanism. The previous effort [1] presented an XID and > time-based invalidation and the inactive time-based approach was > implemented first. The latest XID based patch from Bharath Rupireddy > can be found here [2]. > > When thinking about availability of the database, inactive replication > slots cause two main pain points: > 1) WAL accumulation > 2) Replication slots with xmin/catalog_xmin can hold back vacuuming > leading to wrap-around > > It's easy to imagine a high-XID churning workload in one cluster while > another has large batch jobs where changes get synced out > periodically. There isn't a "one-size" fits all setting for > 'idle_replication_slot_timeout' in these two cases. +1. > The attached patch addresses this by introducing 'max_slot_xid_age' in > a similar fashion. Replication slots with transaction ID greater than > the set age will get invalidated allowing vacuum to proceed, biasing > towards database availability. > > Invalidation happens in CHECKPOINT, similar to > 'idle_replication_slot_timeout', and when VACUUM occurs. > > The patch currently attempts to invalidate once-per-autovacuum worker. > We're wondering if it should attempt invalidation on a per-relation > basis within the vacuum call itself. That would account for scenarios > where the cost_delay or naptime is high between autovac executions. IMO, computing XID horizons per-relation during vacuum is good. 
The main reason we try to invalidate replication slots based on the XID age in the vacuum path is to help the database when it needs it most - when vacuum is computing the XID horizons. That said, it would be good to have performance analysis with a large number of replication slots, comparing once-per-relation vs. once-per-autovacuum worker vs. once-per-autovacuum launcher wake-up cycle. I haven't looked at the patch in depth, but it would be good to have a TAP test with more realistic production workloads. We could set this value to less than 1.5 billion and use xid_wraparound test to quickly reach the wraparound limits, then verify if this setting can help prevent the database from reaching wraparound errors. This approach would also validate the age calculations in try_replication_slot_invalidation with higher limits. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
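To make the once-per-autovacuum-worker option in the comparison above concrete, here is a small sketch of the gating used in the vacuum() hunk of the original patch, reduced to a standalone form. The names vacuum_entry, try_invalidation_stub and invalidation_attempts are hypothetical, and the two boolean parameters stand in for AmAutoVacuumWorkerProcess() and the VACOPT_VACUUM option test; the real invalidation call is replaced by a counter.

```c
#include <stdbool.h>

/* Hypothetical counter standing in for the real invalidation attempt. */
static int invalidation_attempts = 0;

static void
try_invalidation_stub(void)
{
    invalidation_attempts++;
}

/*
 * Hypothetical reduction of the gating in vacuum(): is_vacuum mirrors the
 * VACOPT_VACUUM check, is_autovacuum_worker mirrors
 * AmAutoVacuumWorkerProcess().
 */
static void
vacuum_entry(bool is_autovacuum_worker, bool is_vacuum)
{
    /* Per-process state: survives across calls within one backend. */
    static bool first_time = true;

    if (!is_vacuum)
        return;                 /* ANALYZE-only calls never check slots */

    if (first_time)
        try_invalidation_stub();

    /* Only autovacuum workers stop checking after their first call. */
    if (is_autovacuum_worker)
        first_time = false;
}
```

In this form a backend issuing manual VACUUM attempts invalidation on every call, while an autovacuum worker attempts it only on its first vacuum() call, however many relations it goes on to process. A per-relation variant would move the attempt into the per-table path, paying the XID horizon lookup once per table instead of once per worker.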
* Re: Introduce XID age based replication slot invalidation @ 2026-03-20 16:10 Bharath Rupireddy <[email protected]> parent: Hayato Kuroda (Fujitsu) <[email protected]> 1 sibling, 1 reply; 31+ messages in thread From: Bharath Rupireddy @ 2026-03-20 16:10 UTC (permalink / raw) To: Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; +Cc: pgsql-hackers Hi John, Thank you for sending in the rebased patch earlier. I will have some cycles going forward and I would like to continue with this work. Hi Kuroda-san, Thank you for reviewing the patch. On Fri, Sep 19, 2025 at 1:07 AM Hayato Kuroda (Fujitsu) <[email protected]> wrote: > > IIUC, the feature can directly avoid the wraparound issue than other > invalidation mechanism. The motivation seems enough for me. That's correct. When enabled, replication slots whose XID age exceeds the configured value get invalidated before vacuum computes the XID horizons. This ensures that slots which would otherwise prevent vacuum from freezing heap tuples don't come in the way of XID wraparound prevention. > The patch currently attempts to invalidate once-per-autovacuum worker. > We're wondering if it should attempt invalidation on a per-relation > basis within the vacuum call itself. That would account for scenarios > where the cost_delay or naptime is high between autovac executions. > > I have a concern that age calculation acquire the lock for XidGenLock thus > performance can be affected. Do you have insights for it? I made the following design choice: try invalidating only once per vacuum cycle, not per table. While this keeps the cost of checking (incl. the XidGenLock contention) for invalidation to a minimum when there are a large number of tables and replication slots, it can be less effective when individual tables/indexes are large. Invalidating during checkpoints can help to some extent with the large table/index cases. But I'm open to thoughts on this. Please find the attached patch for further review. 
I fixed the XID age calculation in ReplicationSlotIsXIDAged and adjusted the code comments. -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com Attachments: [application/x-patch] v1-0001-Add-XID-age-based-replication-slot-invalidation.patch (23.5K, 2-v1-0001-Add-XID-age-based-replication-slot-invalidation.patch) download | inline diff: From 96bcbc3d3baff0b801d59823c609949fa520813d Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Fri, 20 Mar 2026 06:41:24 +0000 Subject: [PATCH v1] Add XID age based replication slot invalidation Introduce max_slot_xid_age, a GUC that invalidates replication slots whose xmin or catalog_xmin exceeds the specified age. Disabled by default. Idle or forgotten replication slots can hold back vacuum, leading to bloat and eventually XID wraparound. In the worst case this requires dropping the slot and single-user mode vacuuming. This setting avoids that by proactively invalidating slots that have fallen too far behind. Invalidation checks are performed during checkpoint and vacuum (including autovacuum). 
--- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/transam/xlog.c | 4 +- src/backend/commands/vacuum.c | 53 +++- src/backend/replication/slot.c | 83 +++++- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/include/replication/slot.h | 7 +- .../t/049_invalidate_xid_aged_slots.pl | 240 ++++++++++++++++++ 9 files changed, 424 insertions(+), 7 deletions(-) create mode 100644 src/test/recovery/t/049_invalidate_xid_aged_slots.pl diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 8cdd826fbd3..0cbcb254300 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. A value of zero (which is default) + disables this feature. Users can set this value anywhere from zero to + two billion. This parameter can only be set in the + <filename>postgresql.conf</filename> file or on the server command + line. + </para> + + <para> + This invalidation check happens either when the slot is acquired + for use or during vacuum or during checkpoint. 
+ </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. + </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f5c9a34374d..a87aa9c2bea 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7441,7 +7441,7 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE, _logSegNo, InvalidOid, InvalidTransactionId)) { @@ -7898,7 +7898,7 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE, _logSegNo, InvalidOid, InvalidTransactionId)) { diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index bce3a2daa24..cb2d8272e8b 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -48,6 +48,7 @@ #include 
"postmaster/autovacuum.h" #include "postmaster/bgworker_internals.h" #include "postmaster/interrupt.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/pmsignal.h" @@ -131,6 +132,7 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams params, static double compute_parallel_delay(void); static VacOptValue get_vacoptval_from_boolean(DefElem *def); static bool vac_tid_reaped(ItemPointer itemptr, void *state); +static void try_replication_slot_invalidation(void); /* * GUC check function to ensure GUC value specified is within the allowable @@ -468,6 +470,34 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) MemoryContextDelete(vac_context); } +/* + * Try invalidating replication slots based on current replication slot xmin + * limits once every vacuum cycle. + */ +static void +try_replication_slot_invalidation(void) +{ + TransactionId min_slot_xmin = InvalidTransactionId; + TransactionId min_slot_catalog_xmin = InvalidTransactionId; + + if (max_slot_xid_age == 0) + return; + + ProcArrayGetReplicationSlotXmin(&min_slot_xmin, &min_slot_catalog_xmin); + + if (ReplicationSlotIsXIDAged(min_slot_xmin, min_slot_catalog_xmin)) + { + /* + * Note that InvalidateObsoleteReplicationSlots is also called as part + * of CHECKPOINT, and emitting ERRORs from within is avoided already. + * Therefore, there is no concern here that any ERROR from + * invalidating replication slots blocks VACUUM. + */ + InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0, + InvalidOid, InvalidTransactionId); + } +} + /* * Internal entry point for autovacuum and the VACUUM / ANALYZE commands. 
* @@ -495,7 +525,7 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg MemoryContext vac_context, bool isTopLevel) { static bool in_vacuum = false; - + static bool first_time = true; const char *stmttype; volatile bool in_outer_xact, use_own_xacts; @@ -608,6 +638,27 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg CommitTransactionCommand(); } + if (params.options & VACOPT_VACUUM) + { + /* + * Check once per vacuum cycle, not per-relation, because replication + * slot xmin age is a global property that doesn't change based on + * which table we vacuum. Checkpoints also check, so slots won't go + * undetected for long even if vacuum doesn't run. + * + * Each autovacuum worker checks only once, on its first vacuum() + * call; subsequent calls within the same worker skip the check. The + * next newly launched autovacuum worker will check again. Manual + * VACUUM always checks, since the same backend may issue VACUUM + * repeatedly across its lifetime. + */ + if (first_time) + try_replication_slot_invalidation(); + + if (AmAutoVacuumWorkerProcess()) + first_time = false; + } + /* Turn vacuum cost accounting on or off, and set/clear in_vacuum */ PG_TRY(); { diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..e1e0282f106 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin greater + * than the specified age; '0' disables it. 
+ */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. @@ -1780,7 +1787,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1834,24 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + appendStringInfo(&err_detail, _("The slot's xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."), + xmin, + max_slot_xid_age); + else + appendStringInfo(&err_detail, _("The slot's catalog_xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."), + catalog_xmin, + max_slot_xid_age); + + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } case RS_INVAL_NONE: pg_unreachable(); } @@ -1945,6 +1972,16 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + if (possible_causes & RS_INVAL_XID_AGE) + { + /* + * Safe since we hold the replication slot's spinlock needed to avoid + * race conditions + */ + if (ReplicationSlotIsXIDAged(s->data.xmin, s->data.catalog_xmin)) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -2112,7 +2149,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, s->data.catalog_xmin); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2202,7 @@ 
InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, s->data.catalog_xmin); /* done with this slot for now */ break; @@ -2192,6 +2229,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -3275,3 +3314,41 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Check true if the given passed in xmin or catalog_xmin age is + * older than the age specified by max_slot_xid_age. + */ +bool +ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId catalog_xmin) +{ + TransactionId cutoff; + TransactionId curr; + bool is_aged = false; + + if (max_slot_xid_age == 0) + return false; + + curr = ReadNextTransactionId(); + + /* + * Calculate oldest XID a slot's xmin or catalog_xmin can have before they + * are invalidated. 
+ */ + cutoff = curr - max_slot_xid_age; + + /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */ + /* this can cause the limit to go backwards by 3, but that's OK */ + if (cutoff < FirstNormalTransactionId) + cutoff -= FirstNormalTransactionId; + + if (TransactionIdIsNormal(xmin) && + TransactionIdPrecedesOrEquals(xmin, cutoff)) + is_aged = true; + + if (TransactionIdIsNormal(catalog_xmin) && + TransactionIdPrecedesOrEquals(catalog_xmin, cutoff)) + is_aged = true; + + return is_aged; +} diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 0c9854ad8fc..7fab02f1eeb 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2049,6 +2049,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index e4abe6c0077..0f728d87b6c 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..dce3f0f51b0 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -387,4 +390,6 @@ extern bool SlotExistsInSyncStandbySlots(const char *slot_name); extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel); extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn); +extern bool ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId 
catalog_xmin); + #endif /* SLOT_H */ diff --git a/src/test/recovery/t/049_invalidate_xid_aged_slots.pl b/src/test/recovery/t/049_invalidate_xid_aged_slots.pl new file mode 100644 index 00000000000..f1e8f003f95 --- /dev/null +++ b/src/test/recovery/t/049_invalidate_xid_aged_slots.pl @@ -0,0 +1,240 @@ +# Copyright (c) 2025, PostgreSQL Global Development Group + +# Test for replication slots invalidation due to XID age +use strict; +use warnings FATAL => 'all'; + +use PostgreSQL::Test::BackgroundPsql; +use PostgreSQL::Test::Utils; +use PostgreSQL::Test::Cluster; +use Test::More; + +# Wait for slot to first become inactive and then get invalidated +sub wait_for_slot_invalidation +{ + my ($node, $slot_name, $reason) = @_; + my $name = $node->name; + + # Wait for the inactive replication slot to be invalidated + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + invalidation_reason = '$reason'; + ]) + or die + "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name"; +} + +# Do some work for advancing xids on a given node +sub advance_xids +{ + my ($node, $table_name) = @_; + + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 10000..11000 loop + -- use an exception block so that each iteration eats an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); +} + +# ============================================================================= +# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. 
+ +# Initialize primary node +my $primary = PostgreSQL::Test::Cluster->new('primary'); +$primary->init(allows_streaming => 'logical'); + +# Configure primary with XID age settings +$primary->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = 500 +}); + +$primary->start; + +# Take a backup for creating standby +my $backup_name = 'backup'; +$primary->backup($backup_name); + +# Create a standby linking to the primary using the replication slot +my $standby = PostgreSQL::Test::Cluster->new('standby'); +$standby->init_from_backup($primary, $backup_name, has_streaming => 1); + +# Enable hs_feedback. The slot should gain an xmin. We set the status interval +# so we'll see the results promptly. +$standby->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +max_standby_streaming_delay = 3600000 +}); + +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot', immediately_reserve := true); +]); + +$standby->start; + +# Create some content on primary to move xmin +$primary->safe_psql('postgres', + "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a"); + +# Wait until standby has replayed enough data +$primary->wait_for_catchup($standby); + +$primary->poll_query_until( + 'postgres', qq[ + SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL) + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb_slot'; +]) or die "Timed out waiting for slot sb_slot xmin to advance"; + +# Stop standby to make the replication slot's xmin on primary to age + +# Read on standby that causes xmin to be held on slot +my $standby_session = $standby->interactive_psql('postgres'); +$standby_session->query("BEGIN; SET default_transaction_isolation = 'repeatable read'; SELECT * FROM tab_int;"); + +#$standby->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'tab_int'); + +# Wait for the replication slot to become inactive and then 
invalidated due to +# XID age. +$primary->safe_psql('postgres', "CHECKPOINT"); +wait_for_slot_invalidation($primary, 'sb_slot', 'xid_aged'); + +$standby_session->quit; +$standby->stop; + +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. +# ============================================================================= + +# ============================================================================= +# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. + +# Create a subscriber node +my $subscriber = PostgreSQL::Test::Cluster->new('subscriber'); +$subscriber->init(allows_streaming => 'logical'); +$subscriber->start; + +# Create tables on both primary and subscriber +$primary->safe_psql('postgres', "CREATE TABLE test_tbl (id int)"); +$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)"); + +# Insert some initial data +$primary->safe_psql('postgres', + "INSERT INTO test_tbl VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $publisher_connstr = $primary->connstr . 
' dbname=postgres'; +$primary->safe_psql('postgres', + "CREATE PUBLICATION pub FOR TABLE test_tbl"); + +$subscriber->safe_psql('postgres', + "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub_slot')" +); + +# Wait for initial sync to complete +$subscriber->wait_for_subscription_sync($primary, 'sub'); + +my $result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl"); +is($result, qq(5), "check initial copy was done for logical replication"); + +# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin) +$primary->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub_slot'; +]) or die "Timed out waiting for slot lsub_slot catalog_xmin to advance"; + +# Stop subscriber to make the replication slot on primary inactive +$subscriber->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'test_tbl'); + +# Wait for the replication slot to become inactive and then invalidated due to +# XID age. +$primary->safe_psql('postgres', "CHECKPOINT"); +wait_for_slot_invalidation($primary, 'lsub_slot', 'xid_aged'); + +# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+# ============================================================================= + +# ============================================================================= +# Testcase start: Test VACUUM command triggering slot invalidation +# + +# Create another physical replication slot for VACUUM test +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'vacuum_test_slot', immediately_reserve := true); +]); + +# Create a new standby for this test +my $standby_vacuum = PostgreSQL::Test::Cluster->new('standby_vacuum'); +$standby_vacuum->init_from_backup($primary, $backup_name, has_streaming => 1); + +$standby_vacuum->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'vacuum_test_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$standby_vacuum->start; + +# Wait until standby has replayed enough data and slot gets xmin +$primary->wait_for_catchup($standby_vacuum); + +$primary->poll_query_until( + 'postgres', qq[ + SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL) + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'vacuum_test_slot'; +]) or die "Timed out waiting for slot vacuum_test_slot xmin to advance"; + +# Stop standby to make the replication slot's xmin on primary to age +$standby_vacuum->stop; + +# Do some work to advance xids on primary +advance_xids($primary, 'tab_int'); + +# Use VACUUM to trigger slot invalidation (instead of CHECKPOINT) +# This tests that VACUUM command can trigger XID age invalidation +$primary->safe_psql('postgres', "VACUUM"); + +# Wait for the replication slot to become invalidated due to XID age triggered by VACUUM +$primary->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = 'vacuum_test_slot' AND + invalidation_reason = 'xid_aged'; +]) + or die "Timed out while waiting for slot vacuum_test_slot to be invalidated by VACUUM"; + +# Testcase end: Test VACUUM command triggering slot invalidation 
+# ============================================================================= + +ok(1, "all XID age invalidation tests completed successfully"); + +done_testing(); -- 2.47.3 ^ permalink raw reply [nested|flat] 31+ messages in thread
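Note on the once-per-worker gating in the vacuum() hunk above: the following is a minimal sketch, not code from the patch. It models the process-local `static bool first_time` flag as an explicit struct field so the behaviour can be exercised in isolation. `ToyBackend` and `toy_vacuum` are hypothetical names, and the counter stands in for the call to try_replication_slot_invalidation().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of one backend process. In the patch, first_time is a
 * function-local static in vacuum(), so it lives for the life of the process.
 */
typedef struct ToyBackend
{
    bool        is_autovacuum_worker;
    bool        first_time;     /* starts out true in every new process */
    int         checks_run;     /* counts simulated invalidation attempts */
} ToyBackend;

/* One simulated vacuum() call with VACOPT_VACUUM set. */
static void
toy_vacuum(ToyBackend *backend)
{
    assert(backend != NULL);

    /* stand-in for try_replication_slot_invalidation() */
    if (backend->first_time)
        backend->checks_run++;

    /* only autovacuum workers clear the flag; manual VACUUM keeps checking */
    if (backend->is_autovacuum_worker)
        backend->first_time = false;
}
```

Under this model an autovacuum worker attempts invalidation once no matter how many relations it processes, while a regular backend attempts it on every VACUUM command. That is the trade-off the question about per-relation checks is concerned with.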
* Re: Introduce XID age based replication slot invalidation @ 2026-03-21 06:28 SATYANARAYANA NARLAPURAM <[email protected]> parent: Bharath Rupireddy <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: SATYANARAYANA NARLAPURAM @ 2026-03-21 06:28 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi Bharath, Do you think we need different GUCs for catalog_xmin and xmin? If table bloat is a concern (not catalog bloat), then logical slots are not required to invalidate unless the cluster is close to wraparound. > I made the following design choice: try invalidating only once per > vacuum cycle, not per table. While this keeps the cost of checking > (incl. the XidGenLock contention) for invalidation to a minimum when > there are a large number of tables and replication slots, it can be > less effective when individual tables/indexes are large. Invalidating > during checkpoints can help to some extent with the large table/index > cases. But I'm open to thoughts on this. > It may not solve the intent when the vacuum cycle is longer, which one can expect on a large database particularly when there is heavy bloat. > Please find the attached patch for further review. I fixed the XID age > calculation in ReplicationSlotIsXIDAged and adjusted the code > comments. > I applied the patch and all the tests passed. A few comments: @@ -495,7 +525,7 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg MemoryContext vac_context, bool isTopLevel) { static bool in_vacuum = false; - + static bool first_time = true; first_time variable is not self explanatory, maybe something like try_replication_slot_invalidation and add comments that it will be set to false after the first check? 
+ if (TransactionIdIsValid(xmin)) + appendStringInfo(&err_detail, _("The slot's xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."), + xmin, + max_slot_xid_age); Slot invalidates even when the age is max_slot_xid_age, isn't it? Thanks, Satya ^ permalink raw reply [nested|flat] 31+ messages in thread
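Note on the boundary question above: the following is a minimal standalone sketch, assuming the cutoff arithmetic shown in the v1 ReplicationSlotIsXIDAged hunk. It suggests that a slot whose XID age is exactly max_slot_xid_age is already treated as aged, because the comparison is "precedes or equals". The typedef, the macro and both function names are local stand-ins, not the real PostgreSQL headers or the patch's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the PostgreSQL definitions; not the real headers. */
typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

/* Modulo-2^32 comparison, as TransactionIdPrecedesOrEquals does for normal XIDs. */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/* Hypothetical helper mirroring the cutoff logic for a single slot XID. */
static bool
slot_xid_is_aged(TransactionId slot_xid, TransactionId next_xid, int max_age)
{
    TransactionId cutoff;

    assert(max_age >= 0);

    /* feature disabled, or the slot holds no normal XID */
    if (max_age == 0 || slot_xid < FirstNormalTransactionId)
        return false;

    cutoff = next_xid - (TransactionId) max_age;

    /* keep the cutoff a normal XID; this moves it back by 3 across the wrap */
    if (cutoff < FirstNormalTransactionId)
        cutoff -= FirstNormalTransactionId;

    return xid_precedes_or_equals(slot_xid, cutoff);
}
```

With next XID 1500 and max_age 500 the cutoff is 1000, so a slot xmin of 1000 (age exactly 500) counts as aged while 1001 does not. If that reading of the patch is right, "has reached" would describe the condition more precisely than "exceeds" in the error detail.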
* Re: Introduce XID age based replication slot invalidation @ 2026-03-23 16:00 Bharath Rupireddy <[email protected]> parent: SATYANARAYANA NARLAPURAM <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Bharath Rupireddy @ 2026-03-23 16:00 UTC (permalink / raw) To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Fri, Mar 20, 2026 at 11:29 PM SATYANARAYANA NARLAPURAM <[email protected]> wrote: > > Do you think we need different GUCs for catalog_xmin and xmin? If table bloat is a concern (not catalog bloat), then logical slots are not required to invalidate unless the cluster is close to wraparound. IMO the main purpose of max_slot_xid_age is to prevent XID wraparound. For bloat, I still think max_slot_wal_keep_size is the better choice. Where max_slot_xid_age is really useful is when the vacuum can't freeze because a replication slot (physical or logical) is holding back the XID horizon and the system is getting close to wraparound. Invalidating such a slot clears the way for vacuum. Setting max_slot_xid_age above vacuum_failsafe_age allows vacuum to waste cycles scanning tables it cannot freeze. Keeping max_slot_xid_age <= vacuum_failsafe_age (default 1.6B) prevents this by invalidating the slot before vacuum effort is wasted. As far as XID wraparound is concerned, both xmin and catalog_xmin need to be treated similarly. Either one can hold back freezing and push the system toward wraparound. So I don't think we need separate GUCs for xmin and catalog_xmin unless I'm missing something. One GUC covering both keeps things simple. >> I made the following design choice: try invalidating only once per >> vacuum cycle, not per table. While this keeps the cost of checking >> (incl.
the XidGenLock contention) for invalidation to a minimum when >> there are a large number of tables and replication slots, it can be >> less effective when individual tables/indexes are large. Invalidating >> during checkpoints can help to some extent with the large table/index >> cases. But I'm open to thoughts on this. > > It may not solve the intent when the vacuum cycle is longer, which one can expect on a large database particularly when there is heavy bloat. This design choice boils down to the following: a database instance having either 1/ a large number of small tables or 2/ large tables. ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Introduce XID age based replication slot invalidation @ 2026-03-23 23:36 Masahiko Sawada <[email protected]> parent: Bharath Rupireddy <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Masahiko Sawada @ 2026-03-23 23:36 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Mon, Mar 23, 2026 at 9:00 AM Bharath Rupireddy <[email protected]> wrote: > > Hi, > > On Fri, Mar 20, 2026 at 11:29 PM SATYANARAYANA NARLAPURAM > <[email protected]> wrote: > > > > Do you think we need different GUCs for catalog_xmin and xmin? If table bloat is a concern (not catalog bloat), then logical slots are not required to invalidate unless the cluster is close to wraparound. > > IMO the main purpose of max_slot_xid_age is to prevent XID wraparound. > For bloat, I still think max_slot_wal_keep_size is the better choice. > > Where max_slot_xid_age is really useful is when the vacuum can't > freeze because a replication slot (physical or logical) is holding > back the XID horizon and the system is getting close to wraparound. > Invalidating such a slot clears the way for vacuum. Setting > max_slot_xid_age above vacuum_failsafe_age allows vacuum to waste > cycles scanning tables it cannot freeze. Keeping max_slot_xid_age <= > vacuum_failsafe_age (default 1.6B) prevents this by invalidating the > slot before vacuum effort is wasted. > > As far as XID wraparound is concerned, both xmin and catalog_xmin need > to be treated similarly. Either one can hold back freezing and push > the system toward wraparound. So I don't think we need separate GUCs > for xmin and catalog_xmin unless I'm missing something. One GUC > covering both keeps things simple. I've studied the discussion on this thread and the patch. 
I understand the purpose of this feature and agree that it's useful especially in cases where orphaned (physical or logical) replication slots prevent the xmin from advancing and inactive_since based slot invalidation might not fit. And +1 for treating both the slot's xmin and catalog_xmin similarly with the single GUC. > >> I made the following design choice: try invalidating only once per > >> vacuum cycle, not per table. While this keeps the cost of checking > >> (incl. the XidGenLock contention) for invalidation to a minimum when > >> there are a large number of tables and replication slots, it can be > >> less effective when individual tables/indexes are large. Invalidating > >> during checkpoints can help to some extent with the large table/index > >> cases. But I'm open to thoughts on this. > > > > It may not solve the intent when the vacuum cycle is longer, which one can expect on a large database particularly when there is heavy bloat. > > This design choice boils down to the following: a database instance > having either 1/ a large number of small tables or 2/ large tables. > From my experience, I have seen both cases but mostly case 2 (others > can correct me). In this context, having an XID age based slot > invalidation check once per relation makes sense. However, I'm open to > more thoughts here. ISTM that checking the XID-based slot invalidation per table would be more bullet-proof and cover many cases. How about checking the XID-based slot invalidation opportunity only when the OldestXmin is older than the new GUC? For example, we can do this check in heap_vacuum_rel() based on the VacuumCutoffs returned by vacuum_get_cutoffs(). If we invalidate at least one slot for its XID, we can re-compute the OldestXmin. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 31+ messages in thread
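Note on the per-table proposal above: the following is a rough sketch of the idea, assuming a simplified slot array. It is not the patch's implementation. It first runs a cheap pre-check on the already computed OldestXmin, scans the slots only if that horizon is old enough, and reports whether the caller should recompute OldestXmin. `ToySlot`, `xid_age` and `maybe_invalidate_aged_slots` are hypothetical names. Locking, catalog_xmin and ages beyond 2^31 are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Toy slot; the real ReplicationSlot carries far more state. */
typedef struct ToySlot
{
    TransactionId xmin;         /* 0 means the slot holds nothing back */
    bool        invalidated;
} ToySlot;

/* Age in XIDs relative to the next XID, modulo 2^32. */
static uint32_t
xid_age(TransactionId xid, TransactionId next_xid)
{
    return next_xid - xid;
}

/*
 * Hypothetical per-relation hook. Returns true if at least one slot was
 * invalidated, so the caller knows to recompute OldestXmin.
 */
static bool
maybe_invalidate_aged_slots(ToySlot *slots, size_t nslots,
                            TransactionId oldest_xmin,
                            TransactionId next_xid, uint32_t max_age)
{
    bool        invalidated = false;

    assert(slots != NULL || nslots == 0);

    if (max_age == 0)
        return false;           /* feature disabled */

    /*
     * Cheap pre-check: OldestXmin is no newer than any slot's xmin, so if
     * OldestXmin itself is young enough, no slot can be aged.
     */
    if (xid_age(oldest_xmin, next_xid) < max_age)
        return false;

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].invalidated || slots[i].xmin == 0)
            continue;

        if (xid_age(slots[i].xmin, next_xid) >= max_age)
        {
            slots[i].invalidated = true;
            slots[i].xmin = 0;  /* the slot no longer holds the horizon back */
            invalidated = true;
        }
    }

    return invalidated;
}
```

The point of the pre-check is that the common case, where nothing is close to the limit, costs one subtraction per relation and never touches the slot array.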
* Re: Introduce XID age based replication slot invalidation @ 2026-03-24 21:42 Bharath Rupireddy <[email protected]> parent: Masahiko Sawada <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Bharath Rupireddy @ 2026-03-24 21:42 UTC (permalink / raw) To: Masahiko Sawada <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Mon, Mar 23, 2026 at 4:36 PM Masahiko Sawada <[email protected]> wrote: > > I've studied the discussion on this thread and the patch. I understand > the purpose of this feature and agree that it's useful especially in > cases where orphaned (physical or logical) replication slots prevent > the xmin from advancing and inactive_since based slot invalidation > might not fit. > > And +1 for treating both the slot's xmin and catalog_xmin similarly > with the single GUC. Thanks for reviewing the patch. > > >> I made the following design choice: try invalidating only once per > > >> vacuum cycle, not per table. While this keeps the cost of checking > > >> (incl. the XidGenLock contention) for invalidation to a minimum when > > >> there are a large number of tables and replication slots, it can be > > >> less effective when individual tables/indexes are large. Invalidating > > >> during checkpoints can help to some extent with the large table/index > > >> cases. But I'm open to thoughts on this. > > > > > > It may not solve the intent when the vacuum cycle is longer, which one can expect on a large database particularly when there is heavy bloat. > > > > This design choice boils down to the following: a database instance > > having either 1/ a large number of small tables or 2/ large tables. > > From my experience, I have seen both cases but mostly case 2 (others > > can correct me). In this context, having an XID age based slot > > invalidation check once per relation makes sense. However, I'm open to > > more thoughts here. 
> > ISTM that checking the XID-based slot invalidation per table would be > more bullet-proof and cover many cases. How about checking the > XID-based slot invalidation opportunity only when the OldestXmin is > older than the new GUC? For example, we can do this check in > heap_vacuum_rel() based on the VacuumCutoffs returned by > vacuum_get_cutoffs(). If we invalidate at least one slot for its XID, > we can re-compute the OldestXmin. Agreed. Here's the patch that moves the XID-age based slot invalidation check to vacuum_get_cutoffs. This has some nice advantages: 1/ It makes the check once per table (to help with large tables). 2/ It makes the check less costly since we rely on already computed OldestXmin and nextXID values. 3/ It avoids the checkpointer to do XID-age based slot invalidation which keeps the usage of this GUC simple with no additional costs to the checkpointer - just the vacuum (both vacuum command and autovacuum) does the invalidation when needed. I moved the new tests to the existing TAP test file t/019_replslot_limit.pl alongside other invalidation tests. I added detailed comments around InvalidateXIDAgedReplicationSlots and slightly modified the docs. Please find the v3 patch for further review. PS: Thanks Sawada-san for the offlist chat. -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com Attachments: [application/octet-stream] v3-0001-Add-XID-age-based-replication-slot-invalidation.patch (25.5K, 2-v3-0001-Add-XID-age-based-replication-slot-invalidation.patch) download | inline diff: From 2fba7797aa9f3392a590ae85ca5db0809036dd5e Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Tue, 24 Mar 2026 20:42:24 +0000 Subject: [PATCH v3] Add XID age based replication slot invalidation Introduce max_slot_xid_age, a GUC that invalidates replication slots whose xmin or catalog_xmin exceeds the specified age. Disabled by default. 
Idle or forgotten replication slots can hold back vacuum, leading to bloat and eventually XID wraparound. In the worst case this requires dropping the slot and single-user mode vacuuming. This setting avoids that by proactively invalidating slots that have fallen too far behind. Invalidation checks are performed once per relation during vacuum (both vacuum command and autovacuum). --- doc/src/sgml/config.sgml | 29 +++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/transam/xlog.c | 5 +- src/backend/commands/vacuum.c | 16 ++ src/backend/replication/slot.c | 157 ++++++++++++++++- src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/include/replication/slot.h | 10 +- src/test/recovery/t/019_replslot_limit.pl | 166 ++++++++++++++++++ 10 files changed, 394 insertions(+), 10 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 8cdd826fbd3..e04b1384703 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,35 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. This invalidation check happens + during vacuum (both <command>VACUUM</command> command and autovacuum). + A value of zero (which is default) disables this feature. 
Users can set + this value anywhere from zero to two billion. This parameter can only be + set in the <filename>postgresql.conf</filename> file or on the server + command line. + </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. 
+ </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f5c9a34374d..d5d43620aad 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7443,7 +7443,7 @@ CreateCheckPoint(int flags) KeepLogSeg(recptr, &_logSegNo); if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, InvalidTransactionId)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -7900,7 +7900,7 @@ CreateRestartPoint(int flags) if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, InvalidTransactionId)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -8764,6 +8764,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index bce3a2daa24..252224c0bd6 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -48,6 +48,7 @@ #include "postmaster/autovacuum.h" #include "postmaster/bgworker_internals.h" #include "postmaster/interrupt.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/pmsignal.h" @@ -1145,6 +1146,21 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams params, nextXID = ReadNextTransactionId(); nextMXID = ReadNextMultiXactId(); + /* + * Try to invalidate XID-aged replication slots that may interfere with + * vacuum's ability to freeze and remove dead tuples. Since OldestXmin + * already covers the slot xmin/catalog_xmin values, pass it as a + * preliminary check to avoid additional iteration over all the slots. 
+ * + * If at least one slot was invalidated, recompute OldestXmin so that this + * vacuum benefits from the advanced horizon immediately. + */ + if (InvalidateXIDAgedReplicationSlots(cutoffs->OldestXmin, nextXID)) + { + cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); + } + /* * Also compute the multixact age for which freezing is urgent. This is * normally autovacuum_multixact_freeze_max_age, but may be less if diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..b6905e7bb43 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin greater + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. 
@@ -176,6 +183,8 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr; static void ReplicationSlotShmemExit(int code, Datum arg); static bool IsSlotForConflictCheck(const char *name); static void ReplicationSlotDropPtr(ReplicationSlot *slot); +static bool IsReplicationSlotXIDAged(TransactionId xmin, TransactionId catalog_xmin, + TransactionId nextXID); /* internal persistency functions */ static void RestoreSlotFromDisk(const char *name); @@ -1780,7 +1789,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId nextXID) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1837,36 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's xmin %u at next transaction ID %u exceeds the age %d specified by \"%s\"."), + xmin, + nextXID, + max_slot_xid_age, + "max_slot_xid_age"); + } + else + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's catalog xmin %u at next transaction ID %u exceeds the age %d specified by \"%s\"."), + catalog_xmin, + nextXID, + max_slot_xid_age, + "max_slot_xid_age"); + } + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1874,6 +1916,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId 
nextXID, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1988,11 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && + IsReplicationSlotXIDAged(s->data.xmin, s->data.catalog_xmin, nextXID)) + return RS_INVAL_XID_AGE; + return RS_INVAL_NONE; } @@ -1967,6 +2015,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId nextXID, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2019,6 +2068,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + nextXID, &inactive_since, now); @@ -2112,7 +2162,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, nextXID); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2216,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, nextXID); /* done with this slot for now */ break; @@ -2192,6 +2244,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. 
The "cause" @@ -2205,7 +2259,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, TransactionId nextXID) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2298,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + nextXID, &released_lock)) { Assert(released_lock); @@ -3275,3 +3329,96 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Return true if xmin or catalog_xmin exceeds the max_slot_xid_age GUC. + */ +static bool +IsReplicationSlotXIDAged(TransactionId xmin, TransactionId catalog_xmin, + TransactionId nextXID) +{ + TransactionId cutoffXID; + bool aged = false; + + if (max_slot_xid_age == 0) + return false; + + if (!TransactionIdIsNormal(nextXID)) + return false; + + /* + * Calculate oldest XID a slot's xmin or catalog_xmin can have before they + * are invalidated. + */ + cutoffXID = nextXID - max_slot_xid_age; + + /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */ + /* this can cause the limit to go backwards by 3, but that's OK */ + if (cutoffXID < FirstNormalTransactionId) + cutoffXID -= FirstNormalTransactionId; + + if (TransactionIdIsNormal(xmin) && + TransactionIdPrecedes(xmin, cutoffXID)) + aged = true; + + if (TransactionIdIsNormal(catalog_xmin) && + TransactionIdPrecedes(catalog_xmin, cutoffXID)) + aged = true; + + return aged; +} + +/* + * Invalidate replication slots whose XID age exceeds the max_slot_xid_age + * GUC. + * + * The caller supplies oldestXmin, either computed via + * GetOldestNonRemovableTransactionId during vacuum, or computed via the + * minimum of slot xmin values obtained from ProcArrayGetReplicationSlotXmin, + * and nextXID, the next XID to be assigned used to compute the age. 
+ * + * A preliminary check based on the passed-in oldestXmin is done to avoid + * unnecessarily iterating over all the slots. For this check to be + * effective, oldestXmin must account for slot xmin/catalog_xmin values; if + * its age does not exceed the GUC then no individual slot can either, so the + * per-slot scan is skipped. For example, if oldestXmin is 100 and the GUC + * is 500, every slot's xmin must be >= 100, so none can be older than the + * GUC. + * + * Even if the caller passes an oldestXmin that does not include the slot + * xmin/catalog_xmin range, there is no risk of incorrect invalidation: each + * slot's own xmin and catalog_xmin are individually verified against the GUC + * inside IsReplicationSlotXIDAged(). The only downside is an additional + * iteration over all the slots. + * + * Returns true if at least one slot was invalidated. + */ +bool +InvalidateXIDAgedReplicationSlots(TransactionId oldestXmin, TransactionId nextXID) +{ + TransactionId cutoffXID; + bool invalidated = false; + + Assert(TransactionIdIsNormal(oldestXmin)); + + if (max_slot_xid_age == 0) + return false; + + cutoffXID = nextXID - max_slot_xid_age; + + /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */ + /* this can cause the limit to go backwards by 3, but that's OK */ + if (!TransactionIdIsNormal(cutoffXID)) + cutoffXID = FirstNormalTransactionId; + + if (TransactionIdPrecedes(oldestXmin, cutoffXID)) + { + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, + InvalidOid, + InvalidTransactionId, + nextXID); + } + + return invalidated; +} diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index f3ad90c7c7a..26816d8b73c 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -503,7 +503,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 
0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 0c9854ad8fc..7fab02f1eeb 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2049,6 +2049,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index e4abe6c0077..0f728d87b6c 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..83cf8438724 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + 
RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,8 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId nextXID); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); @@ -387,4 +391,6 @@ extern bool SlotExistsInSyncStandbySlots(const char *slot_name); extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel); extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn); +extern bool InvalidateXIDAgedReplicationSlots(TransactionId oldestXmin, TransactionId nextXID); + #endif /* SLOT_H */ diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..f4dfd0064c7 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,170 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Advance XIDs, run VACUUM, and wait for a slot to be invalidated due to XID age. 
+sub invalidate_slot_by_xid_age +{ + my ($node, $table_name, $slot_name, $slot_type, $nxids) = @_; + + # Do some work to advance xids + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 1..$nxids loop + -- use an exception block so that each iteration eats an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); + + # Trigger slot invalidation via VACUUM + $node->safe_psql('postgres', "VACUUM"); + + # Wait for the replication slot to be invalidated due to XID age. + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) + or die + "Timed out while waiting for slot $slot_name to be invalidated"; + + ok(1, "$slot_type replication slot invalidated due to XID age"); +} + +# ============================================================================= +# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Configure primary with XID age settings +my $max_slot_xid_age = 500; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +}); + +$primary5->start; + +# Take a backup for creating standby +$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create a standby linking to the primary using the replication slot +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); + +# Enable HS feedback. The slot should gain an xmin. We set the status interval +# so we'll see the results promptly. 
+$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +max_standby_streaming_delay = 3600000 +}); + +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); + +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); + +# Wait until standby has replayed enough data +$primary5->wait_for_catchup($standby5); + +$primary5->poll_query_until( + 'postgres', qq[ + SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL) + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin to advance"; + +# Read on standby that causes xmin to be held on slot +my $standby5_session = $standby5->interactive_psql('postgres'); +$standby5_session->query("BEGIN; SET default_transaction_isolation = 'repeatable read'; SELECT * FROM tab_int5;"); + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's xmin age comfortably +# exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'tab_int5', 'sb5_slot', 'physical', + 2 * $max_slot_xid_age); + +$standby5_session->quit; +$standby5->stop; + +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. +# ============================================================================= + +# ============================================================================= +# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+ +# Create a subscriber node +my $subscriber5 = PostgreSQL::Test::Cluster->new('subscriber5'); +$subscriber5->init(allows_streaming => 'logical'); +$subscriber5->start; + +# Create tables on both primary and subscriber +$primary5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); +$subscriber5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); + +# Insert some initial data +$primary5->safe_psql('postgres', + "INSERT INTO test_tbl5 VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $primary5_connstr = $primary5->connstr . ' dbname=postgres'; +$primary5->safe_psql('postgres', + "CREATE PUBLICATION pub5 FOR TABLE test_tbl5"); + +$subscriber5->safe_psql('postgres', + "CREATE SUBSCRIPTION sub5 CONNECTION '$primary5_connstr' PUBLICATION pub5 WITH (slot_name = 'lsub5_slot')" +); + +# Wait for initial sync to complete +$subscriber5->wait_for_subscription_sync($primary5, 'sub5'); + +$result = $subscriber5->safe_psql('postgres', "SELECT count(*) FROM test_tbl5"); +is($result, qq(5), "check initial copy was done for logical replication (XID age test)"); + +# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin) +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin to advance"; + +# Stop subscriber to make the replication slot on primary inactive +$subscriber5->stop; + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's catalog_xmin age +# comfortably exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'test_tbl5', 'lsub5_slot', 'logical', + 2 * $max_slot_xid_age); + +$primary5->stop; + +# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+# ============================================================================= + done_testing(); -- 2.47.3 ^ permalink raw reply [nested|flat] 31+ messages in thread
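The per-slot age test that the patch above implements in IsReplicationSlotXIDAged() can be read in isolation. What follows is a minimal standalone sketch, not the patch's code: `xid_t`, `xid_is_normal`, `xid_precedes`, `xid_age_cutoff` and `slot_xid_aged` are hypothetical stand-ins for TransactionId, TransactionIdIsNormal, TransactionIdPrecedes and the patch's helper. It assumes XIDs are 32-bit counters compared modulo 2^32, that values 0..2 are reserved, and that a cutoff landing on a reserved value is pinned to the first normal XID (one of the two clamping variants that appear in the patch).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t xid_t;                 /* stand-in for TransactionId */

#define FIRST_NORMAL_XID ((xid_t) 3)    /* 0..2 are reserved values */

static bool
xid_is_normal(xid_t xid)
{
    return xid >= FIRST_NORMAL_XID;
}

/*
 * Modulo-2^32 comparison: true if a is logically older than b.
 * Simplification: only meaningful when both arguments are normal XIDs.
 */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Oldest XID a slot may hold before it counts as aged. */
static xid_t
xid_age_cutoff(xid_t next_xid, uint32_t max_age)
{
    xid_t cutoff = next_xid - max_age;  /* unsigned subtraction wraps */

    /* Assumed clamp: pin reserved values to the first normal XID. */
    if (!xid_is_normal(cutoff))
        cutoff = FIRST_NORMAL_XID;
    return cutoff;
}

/* True if either horizon is older than the cutoff; max_age 0 disables. */
static bool
slot_xid_aged(xid_t xmin, xid_t catalog_xmin, xid_t next_xid, uint32_t max_age)
{
    xid_t cutoff;

    if (max_age == 0 || !xid_is_normal(next_xid))
        return false;

    cutoff = xid_age_cutoff(next_xid, max_age);

    if (xid_is_normal(xmin) && xid_precedes(xmin, cutoff))
        return true;
    if (xid_is_normal(catalog_xmin) && xid_precedes(catalog_xmin, cutoff))
        return true;
    return false;
}
```

With a limit of 500 and a next XID of 2000, the cutoff is 1500: a slot holding xmin 1000 counts as aged, while one holding xmin 1600 does not. Because the comparison is modular, the same test still works right after the counter wraps, for example with a next XID of 100 and an xmin just below 2^32.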
* Re: Introduce XID age based replication slot invalidation
@ 2026-03-25 06:50 Masahiko Sawada <[email protected]>
  parent: Bharath Rupireddy <[email protected]>
  0 siblings, 1 reply; 31+ messages in thread
From: Masahiko Sawada @ 2026-03-25 06:50 UTC (permalink / raw)
  To: Bharath Rupireddy <[email protected]>;
  +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

On Tue, Mar 24, 2026 at 2:42 PM Bharath Rupireddy <[email protected]> wrote:
>
> Hi,
>
> On Mon, Mar 23, 2026 at 4:36 PM Masahiko Sawada <[email protected]> wrote:
> >
> > I've studied the discussion on this thread and the patch. I understand
> > the purpose of this feature and agree that it's useful especially in
> > cases where orphaned (physical or logical) replication slots prevent
> > the xmin from advancing and inactive_since based slot invalidation
> > might not fit.
> >
> > And +1 for treating both the slot's xmin and catalog_xmin similarly
> > with the single GUC.
>
> Thanks for reviewing the patch.
>
> > > >> I made the following design choice: try invalidating only once per
> > > >> vacuum cycle, not per table. While this keeps the cost of checking
> > > >> (incl. the XidGenLock contention) for invalidation to a minimum when
> > > >> there are a large number of tables and replication slots, it can be
> > > >> less effective when individual tables/indexes are large. Invalidating
> > > >> during checkpoints can help to some extent with the large table/index
> > > >> cases. But I'm open to thoughts on this.
> > > >
> > > > It may not solve the intent when the vacuum cycle is longer, which one can expect on a large database particularly when there is heavy bloat.
> > >
> > > This design choice boils down to the following: a database instance
> > > having either 1/ a large number of small tables or 2/ large tables.
> > > From my experience, I have seen both cases but mostly case 2 (others
> > > can correct me).
> > > In this context, having an XID age based slot
> > > invalidation check once per relation makes sense. However, I'm open to
> > > more thoughts here.
> >
> > ISTM that checking the XID-based slot invalidation per table would be
> > more bullet-proof and cover many cases. How about checking the
> > XID-based slot invalidation opportunity only when the OldestXmin is
> > older than the new GUC? For example, we can do this check in
> > heap_vacuum_rel() based on the VacuumCutoffs returned by
> > vacuum_get_cutoffs(). If we invalidate at least one slot for its XID,
> > we can re-compute the OldestXmin.
>
> Agreed. Here's the patch that moves the XID-age based slot
> invalidation check to vacuum_get_cutoffs. This has some nice
> advantages: 1/ It makes the check once per table (to help with large
> tables). 2/ It makes the check less costly since we rely on already
> computed OldestXmin and nextXID values. 3/ It avoids the checkpointer
> to do XID-age based slot invalidation which keeps the usage of this
> GUC simple with no additional costs to the checkpointer - just the
> vacuum (both vacuum command and autovacuum) does the invalidation when
> needed.
>
> I moved the new tests to the existing TAP test file
> t/019_replslot_limit.pl alongside other invalidation tests.
>
> I added detailed comments around InvalidateXIDAgedReplicationSlots and
> slightly modified the docs.
>
> Please find the v3 patch for further review.

Thank you for updating the patch. I think the patch is reasonably
simple and, thanks to the XID-based checks, avoids unnecessary
overhead well. Here are some comments:

+    /*
+     * Try to invalidate XID-aged replication slots that may interfere with
+     * vacuum's ability to freeze and remove dead tuples. Since OldestXmin
+     * already covers the slot xmin/catalog_xmin values, pass it as a
+     * preliminary check to avoid additional iteration over all the slots.
+     *
+     * If at least one slot was invalidated, recompute OldestXmin so that this
+     * vacuum benefits from the advanced horizon immediately.
+     */
+    if (InvalidateXIDAgedReplicationSlots(cutoffs->OldestXmin, nextXID))
+    {
+        cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel);
+        Assert(TransactionIdIsNormal(cutoffs->OldestXmin));
+    }

vacuum_get_cutoffs() is also called by VACUUM FULL, CLUSTER, and
REPACK. I'm not sure that users would expect slot invalidation to
happen in these commands too. I think it's better to leave
vacuum_get_cutoffs() a pure cutoff computation function and try to
invalidate slots in heap_vacuum_rel() instead. That requires an
additional ReadNextTransactionId() call, but we can live with it, or
we can make vacuum_get_cutoffs() return the nextXID as well (stored
in *cutoffs).

---
+    /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */
+    /* this can cause the limit to go backwards by 3, but that's OK */
+    if (!TransactionIdIsNormal(cutoffXID))
+        cutoffXID = FirstNormalTransactionId;
+
+    if (TransactionIdPrecedes(oldestXmin, cutoffXID))
+    {
+        invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
+                                                         0,
+                                                         InvalidOid,
+                                                         InvalidTransactionId,
+                                                         nextXID);
+    }

I think it's better to check procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin before iterating over each
slot. Otherwise, we would end up checking every slot even when it is
a long-running transaction, not a slot, that holds the oldest xmin
back.

---
+    if (cutoffXID < FirstNormalTransactionId)
+        cutoffXID -= FirstNormalTransactionId;

and

+    if (!TransactionIdIsNormal(cutoffXID))
+        cutoffXID = FirstNormalTransactionId;

These two snippets carry the same comment but do slightly different
things. I guess the latter is missing '-'?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

^ permalink raw reply [nested|flat] 31+ messages in thread
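The difference between the two clamping forms raised in the review above can be seen with plain unsigned arithmetic. This is a minimal standalone sketch under stated assumptions, not code from the patch: `xid_t` stands in for TransactionId, `FIRST_NORMAL_XID` for FirstNormalTransactionId (assumed to be 3), and both function names are hypothetical.

```c
#include <stdint.h>

typedef uint32_t xid_t;                 /* stand-in for TransactionId */

#define FIRST_NORMAL_XID ((xid_t) 3)    /* 0..2 are reserved values */

/* Form 1: pin a reserved result to the first normal XID. */
static xid_t
cutoff_clamp_assign(xid_t next_xid, uint32_t max_age)
{
    xid_t cutoff = next_xid - max_age;  /* unsigned subtraction wraps */

    if (cutoff < FIRST_NORMAL_XID)
        cutoff = FIRST_NORMAL_XID;
    return cutoff;
}

/* Form 2: step back over the reserved values, wrapping below zero. */
static xid_t
cutoff_clamp_subtract(xid_t next_xid, uint32_t max_age)
{
    xid_t cutoff = next_xid - max_age;  /* unsigned subtraction wraps */

    if (cutoff < FIRST_NORMAL_XID)
        cutoff -= FIRST_NORMAL_XID;
    return cutoff;
}
```

The two forms agree whenever the raw subtraction yields a normal XID. They differ only when it lands on 0, 1 or 2: with a next XID of 501 and an age of 500 the raw result is 1, which the first form moves forward to 3 and the second moves back to 4294967294. Only the second matches a comment that says the limit can "go backwards by 3". Either result is a normal XID, so the practical concern is mainly that a pre-check and the per-slot check use the same form.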
* Re: Introduce XID age based replication slot invalidation
@ 2026-03-25 19:17 Bharath Rupireddy <[email protected]>
  parent: Masahiko Sawada <[email protected]>
  0 siblings, 3 replies; 31+ messages in thread
From: Bharath Rupireddy @ 2026-03-25 19:17 UTC (permalink / raw)
  To: Masahiko Sawada <[email protected]>;
  +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

Hi,

On Tue, Mar 24, 2026 at 11:50 PM Masahiko Sawada <[email protected]> wrote:
>
> > Please find the v3 patch for further review.
>
> Thank you for updating the patch. I think the patch is reasonably
> simple and can avoid unnecessary overheads well due to XID-based
> checks. Here are some comments:

Thank you for reviewing the patch.

> vacuum_get_cutoff() is also called by VACUUM FULL, CLUSTER, and
> REPACK. I'm not sure that users would expect the slot invalidation
> also in these commands. I think it's better to leave
> vacuum_get_cutoff() a pure cutoff computation function and we can try
> to invalidate slots in heap_vacuum_rel(). It requires additional
> ReadNextTransactionId() but we can live with it, or we can make
> vacuum_get_cutoffs() return the nextXID as well (stored in *cutoffs).

+1. I chose to perform the slot invalidation in heap_vacuum_rel by
getting the next txn ID and calling vacuum_get_cutoffs again when a
slot gets invalidated. IMHO, this is simpler than adding a flag and
doing the invalidation selectively in vacuum_get_cutoffs.

> if (TransactionIdPrecedes(oldestXmin, cutoffXID))
> +    {
> +        invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
> +                                                         0,
> +                                                         InvalidOid,
> +                                                         InvalidTransactionId,
> +                                                         nextXID);
> +    }
>
> I think it's better to check the procArray->replication_slot_xmin and
> procArray->replication_slot_catalog_xmin before iterating over each
> slot. Otherwise, we would end up checking every slot even when a long
> running transaction holds the oldestxmin back.

+1. Changed.
> +    if (!TransactionIdIsNormal(cutoffXID))
> +        cutoffXID = FirstNormalTransactionId;
>
> These codes have the same comment but are doing a slightly different
> thing. I guess the latter is missing '-'?

Fixed the typo.

I fixed a test error being reported in CI.

Please find the attached v4 patch for further review.

I've also attached the 0002 patch that adds a test case to demo a
production-like scenario by pushing the database to XID wraparound
limits and checking if the XID-age based invalidation with the GUC
setting at the default vacuum_failsafe_age of 1.6B works correctly,
and whether autovacuum can successfully remove this replication slot
blocker to proceed with freezing and bring the database back to
normal. I don't intend to get this committed unless others think
otherwise, but I wanted to have this as a reference.

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Attachments:

[application/x-patch] v4-0002-Add-more-tests-for-XID-age-slot-invalidation.patch (7.4K, 2-v4-0002-Add-more-tests-for-XID-age-slot-invalidation.patch)
download | inline diff:

From 75a9c4562a0166f0d612a00a50597a40172adec8 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Wed, 25 Mar 2026 18:44:17 +0000
Subject: [PATCH v4 2/2] Add more tests for XID age slot invalidation

Consume XIDs up to wraparound WARNING limits with max_slot_xid_age
matching vacuum_failsafe_age (1.6B). Verify that autovacuum
invalidates the inactive replication slot (XID-age-based
invalidation), unblocks datfrozenxid advancement, and prevents
wraparound without any intervention.
--- src/test/recovery/Makefile | 3 +- src/test/recovery/t/019_replslot_limit.pl | 162 ++++++++++++++++++++++ 2 files changed, 164 insertions(+), 1 deletion(-) diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile index d41aaaf8ae1..5c3d2c89941 100644 --- a/src/test/recovery/Makefile +++ b/src/test/recovery/Makefile @@ -12,7 +12,8 @@ EXTRA_INSTALL=contrib/pg_prewarm \ contrib/pg_stat_statements \ contrib/test_decoding \ - src/test/modules/injection_points + src/test/modules/injection_points \ + src/test/modules/xid_wraparound subdir = src/test/recovery top_builddir = ../../.. diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index c87657451bd..220bcfae3ee 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -703,4 +703,166 @@ $primary5->stop; # GUC. # ============================================================================= +# ============================================================================= +# Testcase: XID-age-based slot invalidation in a production-like scenario. +# Standby sets slot xmin via HS feedback, disconnects, XIDs consumed. +# Autovacuum automatically invalidates the slot once its xmin age exceeds +# max_slot_xid_age, advances datfrozenxid in all databases, and keeps the +# system healthy — no manual VACUUM, vacuumdb, or downtime needed. + +# Check if autovacuum has invalidated the slot due to xid_aged. +# Returns 1 if invalidated, 0 otherwise. Early exit when max_slot_xid_age = 0. 
+sub check_slot_invalidated +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + return 0 if $max_age == 0; + + my $reason = $node->safe_psql('postgres', + "SELECT invalidation_reason FROM pg_replication_slots WHERE slot_name = '$slot_name'"); + if ($reason eq 'xid_aged') + { + diag "Slot invalidated by autovacuum after consuming $consumed_xids XIDs"; + return 1; + } + return 0; +} + +# Verify server log shows slot invalidation by autovacuum worker with +# correct xmin, age, and next txid values. +sub verify_slot_xid_aged_invalidation +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids) = @_; + + my $log = slurp_file($node->logfile); + + # Verify the invalidation was performed by an autovacuum worker. + like($log, + qr/autovacuum worker\[\d+\] LOG:\s+invalidating obsolete replication slot "$slot_name"/, + "server log: $slot_name invalidated by autovacuum worker"); + + # Verify DETAIL shows the correct xmin and max_slot_xid_age. + like($log, + qr/autovacuum worker\[\d+\] DETAIL:\s+The slot's xmin $slot_xmin at next transaction ID (\d+) exceeds the age $max_age specified by "max_slot_xid_age"/, + "server log: DETAIL shows xmin $slot_xmin and age $max_age"); + + # Extract next txid from the log and report for diagnostics. + $log =~ + /The slot's xmin $slot_xmin at next transaction ID (\d+) exceeds/; + my $log_next_txid = $1 // 'N/A'; + diag "next_txid from server log=$log_next_txid, max_slot_xid_age=$max_age, consumed=$consumed_xids XIDs"; +} + +# Verify slot was invalidated and wait for autovacuum to advance datfrozenxid +# in all databases. Early exit when max_slot_xid_age = 0. 
+sub verify_invalidation_and_recovery +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids, $slot_gone) = @_; + + return if $max_age == 0; + + ok($slot_gone, 'autovacuum invalidated slot due to xid_aged'); + + verify_slot_xid_aged_invalidation($node, $slot_name, + $slot_xmin, $max_age, $consumed_xids); + + # Wait for autovacuum to advance datfrozenxid in all databases past the + # wraparound danger zone — no manual intervention required. + $node->poll_query_until( + 'postgres', qq[ + SELECT NOT EXISTS ( + SELECT 1 FROM pg_database + WHERE age(datfrozenxid) > 2000000000 + ); + ]) or die "Timed out waiting for autovacuum to advance datfrozenxid in all databases"; +} + +my $primary6 = PostgreSQL::Test::Cluster->new('primary6'); +$primary6->init(allows_streaming => 'logical'); + +$max_slot_xid_age = 1600000000; # matches vacuum_failsafe_age default +$primary6->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = 1s +}); + +$primary6->start; +$primary6->safe_psql('postgres', "CREATE EXTENSION xid_wraparound"); + +$backup_name = 'backup6'; +$primary6->backup($backup_name); + +my $standby6 = PostgreSQL::Test::Cluster->new('standby6'); +$standby6->init_from_backup($primary6, $backup_name, has_streaming => 1); +$standby6->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb6_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary6->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb6_slot', true)"); + +$standby6->start; + +$primary6->safe_psql('postgres', + "CREATE TABLE tab_int6 AS SELECT generate_series(1,10) AS a"); +$primary6->wait_for_catchup($standby6); + +$primary6->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL FROM pg_replication_slots + WHERE slot_name = 'sb6_slot'; +]) or die "Timed out waiting for sb6_slot xmin from HS feedback"; + +$result = $primary6->safe_psql('postgres', + "SELECT xmin IS NOT NULL FROM pg_replication_slots WHERE 
slot_name = 'sb6_slot'"); +is($result, 't', 'slot has xmin from hot_standby_feedback'); + +# Capture the slot's xmin for later log verification. +my $slot_xmin = $primary6->safe_psql('postgres', + "SELECT xmin FROM pg_replication_slots WHERE slot_name = 'sb6_slot'"); + +# Stop standby; slot xmin persists and holds back datfrozenxid. +$standby6->stop; + +# Consume XIDs in 50M chunks. Once we exceed max_slot_xid_age, autovacuum +# (naptime=1s) should automatically invalidate the slot. Keep consuming +# until we see that happen — no manual VACUUM or downtime needed. +my $logstart6 = -s $primary6->logfile; +my $chunk = 50_000_000; +my $max_xids = 2_200_000_000; +my $consumed = 0; +my $slot_gone = 0; + +while ($consumed < $max_xids) +{ + $primary6->safe_psql('postgres', "SELECT consume_xids($chunk)"); + $consumed += $chunk; + my $remaining = $max_xids - $consumed; + diag "Consumed $consumed / $max_xids XIDs ($remaining remaining)"; + + if (!$slot_gone && check_slot_invalidated($primary6, 'sb6_slot', + $max_slot_xid_age, $consumed)) + { + $slot_gone = 1; + } +} + +verify_invalidation_and_recovery($primary6, 'sb6_slot', + $slot_xmin, $max_slot_xid_age, $consumed, $slot_gone); + +# Consume 1B more XIDs — combining with the 2.2B consumed above, the total +# of 3.2B exceeds the 2^31 (~2.1B) usable XID space (xidStopLimit), i.e. +# more than one full wraparound cycle, proving the system is healthy. +$primary6->safe_psql('postgres', "SELECT consume_xids(1000000000)"); +ok(1, 'writes succeed after autovacuum invalidated the slot'); + +$primary6->stop; + +# Testcase end: XID-age-based slot invalidation in a production-like scenario. 
+# =============================================================================
+
 done_testing();
-- 
2.47.3

[application/x-patch] v4-0001-Add-XID-age-based-replication-slot-invalidation.patch (25.9K, 3-v4-0001-Add-XID-age-based-replication-slot-invalidation.patch)
download | inline diff:
From 84fa3816181a5817228b8d4690c484eb8f1650ba Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Wed, 25 Mar 2026 18:41:58 +0000
Subject: [PATCH v4 1/2] Add XID age based replication slot invalidation

Introduce max_slot_xid_age, a GUC that invalidates replication slots
whose xmin or catalog_xmin exceeds the specified age. Disabled by
default.

Idle or forgotten replication slots can hold back vacuum, leading to
bloat and eventually XID wraparound. In the worst case this requires
dropping the slot and single-user mode vacuuming. This setting avoids
that by proactively invalidating slots that have fallen too far behind.

Invalidation checks are performed once per relation during vacuum
(both vacuum command and autovacuum).
--- doc/src/sgml/config.sgml | 29 +++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 18 ++ src/backend/access/transam/xlog.c | 5 +- src/backend/replication/slot.c | 166 +++++++++++++++++- src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/include/replication/slot.h | 10 +- src/test/recovery/t/019_replslot_limit.pl | 163 +++++++++++++++++ 10 files changed, 402 insertions(+), 10 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 8cdd826fbd3..e04b1384703 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,35 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. This invalidation check happens + during vacuum (both <command>VACUUM</command> command and autovacuum). + A value of zero (which is default) disables this feature. Users can set + this value anywhere from zero to two billion. This parameter can only be + set in the <filename>postgresql.conf</filename> file or on the server + command line. + </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. 
+ See <xref linkend="routine-vacuuming"/> for more details. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. + </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index f698c2d899b..07c897d5b27 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -147,6 +147,7 @@ #include "pgstat.h" #include "portability/instr_time.h" #include "postmaster/autovacuum.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/freespace.h" #include "storage/latch.h" @@ -799,6 +800,23 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, * to increase the number of dead tuples it can prune away.) */ vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + + /* + * Try to invalidate XID-aged replication slots that may interfere with + * vacuum's ability to freeze and remove dead tuples. Since OldestXmin + * already covers the slot xmin/catalog_xmin values, pass it as a + * preliminary check to avoid additional iteration over all the slots. + * + * If at least one slot was invalidated, recompute cutoffs so that this + * vacuum benefits from the advanced horizon immediately. 
+ * + * XXX: Next XID could be returned as output from vacuum_get_cutoffs() but + * for now we live with an additional ReadNextTransactionId() call. + */ + if (InvalidateXIDAgedReplicationSlots(vacrel->cutoffs.OldestXmin, + ReadNextTransactionId())) + vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f5c9a34374d..d5d43620aad 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7443,7 +7443,7 @@ CreateCheckPoint(int flags) KeepLogSeg(recptr, &_logSegNo); if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, InvalidTransactionId)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -7900,7 +7900,7 @@ CreateRestartPoint(int flags) if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, InvalidTransactionId)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -8764,6 +8764,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..8ca5d86fe0e 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of 
replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin greater + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. @@ -176,6 +183,8 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr; static void ReplicationSlotShmemExit(int code, Datum arg); static bool IsSlotForConflictCheck(const char *name); static void ReplicationSlotDropPtr(ReplicationSlot *slot); +static bool IsReplicationSlotXIDAged(TransactionId xmin, TransactionId catalog_xmin, + TransactionId nextXID); /* internal persistency functions */ static void RestoreSlotFromDisk(const char *name); @@ -1780,7 +1789,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId nextXID) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1837,36 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's xmin %u at next transaction ID %u exceeds the age %d specified by \"%s\"."), + xmin, + nextXID, + max_slot_xid_age, + "max_slot_xid_age"); + } + else + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's catalog xmin %u at next transaction ID %u exceeds the age %d specified by \"%s\"."), + catalog_xmin, + nextXID, + max_slot_xid_age, + "max_slot_xid_age"); + } + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You 
might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1874,6 +1916,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId nextXID, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1988,11 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && + IsReplicationSlotXIDAged(s->data.xmin, s->data.catalog_xmin, nextXID)) + return RS_INVAL_XID_AGE; + return RS_INVAL_NONE; } @@ -1967,6 +2015,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId nextXID, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2019,6 +2068,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + nextXID, &inactive_since, now); @@ -2112,7 +2162,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, nextXID); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2216,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, nextXID); /* done with this slot for now */ break; @@ -2192,6 +2244,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. 
* - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2205,7 +2259,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, TransactionId nextXID) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2298,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + nextXID, &released_lock)) { Assert(released_lock); @@ -3275,3 +3329,105 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Check if the passed-in xmin or catalog_xmin have aged beyond the + * max_slot_xid_age GUC limit relative to nextXID. + * + * Returns true if either value exceeds the configured age. 
+ */ +static bool +IsReplicationSlotXIDAged(TransactionId xmin, TransactionId catalog_xmin, + TransactionId nextXID) +{ + TransactionId cutoffXID; + bool aged = false; + + if (max_slot_xid_age == 0) + return false; + + if (!TransactionIdIsNormal(nextXID)) + return false; + + cutoffXID = nextXID - max_slot_xid_age; + + /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */ + /* this can cause the limit to go backwards by 3, but that's OK */ + if (cutoffXID < FirstNormalTransactionId) + cutoffXID -= FirstNormalTransactionId; + + if (TransactionIdIsNormal(xmin) && + TransactionIdPrecedes(xmin, cutoffXID)) + aged = true; + + if (TransactionIdIsNormal(catalog_xmin) && + TransactionIdPrecedes(catalog_xmin, cutoffXID)) + aged = true; + + return aged; +} + +/* + * Invalidate replication slots whose XID age exceeds the max_slot_xid_age + * GUC. + * + * The caller supplies oldestXmin, either computed via + * GetOldestNonRemovableTransactionId during vacuum, or computed via the + * minimum of slot xmin values obtained from ProcArrayGetReplicationSlotXmin, + * and nextXID, the next XID to be assigned used to compute the age. + * + * Preliminary checks based on the passed-in oldestXmin and the oldest slot + * xmin and catalog_xmin are done to avoid unnecessarily iterating over all + * the slots. If the oldestXmin age does not exceed the GUC then no + * individual slot can either, so the per-slot scan is skipped. For example, + * if oldestXmin is 100 and the GUC is 500, every slot's xmin must be >= 100, + * so none can be older than the GUC. Similarly, if the oldest slot xmin and + * catalog_xmin from ProcArray are not aged, the per-slot scan is skipped; + * this can happen when a long-running transaction holds the oldestXmin back. 
+ * + * Even if the caller passes an oldestXmin that does not include the slot + * xmin/catalog_xmin range, there is no risk of incorrect invalidation: each + * slot's own xmin and catalog_xmin are individually verified against the GUC + * inside IsReplicationSlotXIDAged(). The only downside is an additional + * iteration over all the slots. + * + * Returns true if at least one slot was invalidated. + */ +bool +InvalidateXIDAgedReplicationSlots(TransactionId oldestXmin, TransactionId nextXID) +{ + TransactionId cutoffXID; + bool invalidated = false; + + if (max_slot_xid_age == 0) + return false; + + if (!TransactionIdIsNormal(oldestXmin) || !TransactionIdIsNormal(nextXID)) + return false; + + cutoffXID = nextXID - max_slot_xid_age; + + /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */ + /* this can cause the limit to go backwards by 3, but that's OK */ + if (!TransactionIdIsNormal(cutoffXID)) + cutoffXID -= FirstNormalTransactionId; + + if (TransactionIdPrecedes(oldestXmin, cutoffXID)) + { + TransactionId slot_xmin; + TransactionId slot_catalog_xmin; + + ProcArrayGetReplicationSlotXmin(&slot_xmin, &slot_catalog_xmin); + + if (IsReplicationSlotXIDAged(slot_xmin, slot_catalog_xmin, nextXID)) + { + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, + InvalidOid, + InvalidTransactionId, + nextXID); + } + } + + return invalidated; +} diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index de9092fdf5b..d60f39ec08e 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat 
index 0c9854ad8fc..7fab02f1eeb 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2049,6 +2049,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index e4abe6c0077..0f728d87b6c 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..83cf8438724 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization 
worker is running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,8 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId nextXID); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); @@ -387,4 +391,6 @@ extern bool SlotExistsInSyncStandbySlots(const char *slot_name); extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel); extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn); +extern bool InvalidateXIDAgedReplicationSlots(TransactionId oldestXmin, TransactionId nextXID); + #endif /* SLOT_H */ diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..c87657451bd 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,167 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Advance XIDs, run VACUUM, and wait for a slot to be invalidated due to XID age. 
+sub invalidate_slot_by_xid_age +{ + my ($node, $table_name, $slot_name, $slot_type, $nxids) = @_; + + # Do some work to advance xids + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 1..$nxids loop + -- use an exception block so that each iteration eats an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); + + # Trigger slot invalidation via VACUUM + $node->safe_psql('postgres', "VACUUM"); + + # Wait for the replication slot to be invalidated due to XID age. + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) + or die + "Timed out while waiting for slot $slot_name to be invalidated"; + + ok(1, "$slot_type replication slot invalidated due to XID age"); +} + +# ============================================================================= +# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Configure primary with XID age settings +my $max_slot_xid_age = 500; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +}); + +$primary5->start; + +# Take a backup for creating standby +$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create a standby linking to the primary using the replication slot +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); + +# Enable HS feedback. The slot should gain an xmin. We set the status interval +# so we'll see the results promptly. 
+$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); + +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); + +# Wait until standby has replayed enough data +$primary5->wait_for_catchup($standby5); + +# Wait for the slot to get xmin from hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby to make the replication slot on primary inactive. +# The slot's xmin persists and holds back datfrozenxid. +$standby5->stop; + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's xmin age comfortably +# exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'tab_int5', 'sb5_slot', 'physical', + 2 * $max_slot_xid_age); + +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. +# ============================================================================= + +# ============================================================================= +# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+ +# Create a subscriber node +my $subscriber5 = PostgreSQL::Test::Cluster->new('subscriber5'); +$subscriber5->init(allows_streaming => 'logical'); +$subscriber5->start; + +# Create tables on both primary and subscriber +$primary5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); +$subscriber5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); + +# Insert some initial data +$primary5->safe_psql('postgres', + "INSERT INTO test_tbl5 VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $primary5_connstr = $primary5->connstr . ' dbname=postgres'; +$primary5->safe_psql('postgres', + "CREATE PUBLICATION pub5 FOR TABLE test_tbl5"); + +$subscriber5->safe_psql('postgres', + "CREATE SUBSCRIPTION sub5 CONNECTION '$primary5_connstr' PUBLICATION pub5 WITH (slot_name = 'lsub5_slot')" +); + +# Wait for initial sync to complete +$subscriber5->wait_for_subscription_sync($primary5, 'sub5'); + +$result = $subscriber5->safe_psql('postgres', "SELECT count(*) FROM test_tbl5"); +is($result, qq(5), "check initial copy was done for logical replication (XID age test)"); + +# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin) +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin to advance"; + +# Stop subscriber to make the replication slot on primary inactive +$subscriber5->stop; + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's catalog_xmin age +# comfortably exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'test_tbl5', 'lsub5_slot', 'logical', + 2 * $max_slot_xid_age); + +$primary5->stop; + +# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. 
+# ============================================================================= + done_testing(); -- 2.47.3 ^ permalink raw reply [nested|flat] 31+ messages in thread
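The cutoff arithmetic in IsReplicationSlotXIDAged() depends on XIDs being compared modulo 2^32. The following is a minimal standalone sketch of that logic, not the patch itself: the typedef and macros mirror the PostgreSQL definitions, while xid_precedes(), compute_cutoff() and slot_xid_aged() are hypothetical names, and xid_precedes() only models the normal-XID branch of TransactionIdPrecedes().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId   ((TransactionId) 3)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/* Stand-in for the GUC; 0 disables the check, as in the patch. */
int max_slot_xid_age = 0;

/*
 * Modulo-2^32 comparison for normal XIDs: true if id1 is logically
 * older than id2. (Hypothetical helper; models only the normal-XID
 * case of TransactionIdPrecedes.)
 */
bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    int32_t diff = (int32_t) (id1 - id2);

    return diff < 0;
}

/*
 * XIDs logically older than the returned value are "too old". The
 * unsigned subtraction wraps around on purpose. If the result lands on
 * a special XID (0..2), step back past the special range so that the
 * modular comparison stays valid.
 */
TransactionId
compute_cutoff(TransactionId next_xid)
{
    TransactionId cutoff;

    assert(max_slot_xid_age >= 0);
    cutoff = next_xid - (TransactionId) max_slot_xid_age;
    if (!TransactionIdIsNormal(cutoff))
        cutoff -= FirstNormalTransactionId;
    return cutoff;
}

/* True if either xmin or catalog_xmin is older than the cutoff. */
bool
slot_xid_aged(TransactionId xmin, TransactionId catalog_xmin,
              TransactionId next_xid)
{
    TransactionId cutoff;

    if (max_slot_xid_age == 0 || !TransactionIdIsNormal(next_xid))
        return false;

    cutoff = compute_cutoff(next_xid);

    if (TransactionIdIsNormal(xmin) && xid_precedes(xmin, cutoff))
        return true;
    if (TransactionIdIsNormal(catalog_xmin) &&
        xid_precedes(catalog_xmin, cutoff))
        return true;
    return false;
}
```

With max_slot_xid_age = 500, a slot xmin of 1000 is not aged at next XID 1400 (cutoff 900) but is aged at next XID 1600 (cutoff 1100). The same holds across wraparound: at next XID 100 the cutoff wraps to 4294966896, and an xmin of 4294966000 is still reported as aged.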
* Re: Introduce XID age based replication slot invalidation @ 2026-03-26 09:48 SATYANARAYANA NARLAPURAM <[email protected]> parent: Bharath Rupireddy <[email protected]> 2 siblings, 0 replies; 31+ messages in thread From: SATYANARAYANA NARLAPURAM @ 2026-03-26 09:48 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

On Wed, Mar 25, 2026 at 12:17 PM Bharath Rupireddy <[email protected]> wrote:

> Hi,
>
> On Tue, Mar 24, 2026 at 11:50 PM Masahiko Sawada <[email protected]> wrote:
> >
> > > Please find the v3 patch for further review.
> >
> > Thank you for updating the patch. I think the patch is reasonably
> > simple and can avoid unnecessary overheads well due to XID-based
> > checks. Here are some comments:
>
> Thank you for reviewing the patch.
>
> > vacuum_get_cutoffs() is also called by VACUUM FULL, CLUSTER, and
> > REPACK. I'm not sure that users would expect the slot invalidation
> > also in these commands. I think it's better to leave
> > vacuum_get_cutoffs() a pure cutoff computation function and we can try
> > to invalidate slots in heap_vacuum_rel(). It requires additional
> > ReadNextTransactionId() but we can live with it, or we can make
> > vacuum_get_cutoffs() return the nextXID as well (stored in *cutoffs).
>
> +1. I chose to perform the slot invalidation in heap_vacuum_rel by
> getting the next txn ID and calling vacuum_get_cutoffs again when a
> slot gets invalidated. IMHO, this is simpler than adding a flag and
> doing the invalidation selectively in vacuum_get_cutoffs.
>
> >     if (TransactionIdPrecedes(oldestXmin, cutoffXID))
> > +   {
> > +       invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
> > +                                                        0,
> > +                                                        InvalidOid,
> > +                                                        InvalidTransactionId,
> > +                                                        nextXID);
> > +   }
> >
> > I think it's better to check the procArray->replication_slot_xmin and
> > procArray->replication_slot_catalog_xmin before iterating over each
> > slot. Otherwise, we would end up checking every slot even when a long
> > running transaction holds the oldestxmin back.
>
> +1. Changed.
>
> > +   if (!TransactionIdIsNormal(cutoffXID))
> > +       cutoffXID = FirstNormalTransactionId;
> >
> > These two code fragments have the same comment but do slightly
> > different things. I guess the latter is missing '-'?
>
> Fixed the typo.
>
> I fixed a test error being reported in CI.
>
> Please find the attached v4 patch for further review.
>

 InvalidateObsoleteReplicationSlots(uint32 possible_causes,
                                    XLogSegNo oldestSegno, Oid dboid,
-                                   TransactionId snapshotConflictHorizon)
+                                   TransactionId snapshotConflictHorizon, TransactionId nextXID)

Maybe put TransactionId nextXID on a new line?

Thinking out loud: vacuum doesn't run on a hot standby, which means this
GUC is not applicable there. Is this intended? Why not perform the check
during checkpoint/restartpoint as well, like the other slot invalidation
checks?

Thanks,
Satya

^ permalink raw reply [nested|flat] 31+ messages in thread
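The checkpoint/restartpoint question comes down to which bits are passed in possible_causes. Below is a small standalone sketch of that bitmask, under stated assumptions: the enum values for RS_INVAL_WAL_LEVEL, RS_INVAL_IDLE_TIMEOUT and RS_INVAL_XID_AGE are taken from the quoted v4 diff, the first two non-zero values follow the upstream header, and the three functions are hypothetical names used only for illustration. The v4 patch does not pass RS_INVAL_XID_AGE from CreateCheckPoint() or CreateRestartPoint(); the second function only shows what such a call site's mask might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the enum in src/include/replication/slot.h as of v4. */
typedef enum ReplicationSlotInvalidationCause
{
    RS_INVAL_NONE = 0,
    RS_INVAL_WAL_REMOVED = (1 << 0),
    RS_INVAL_HORIZON = (1 << 1),
    RS_INVAL_WAL_LEVEL = (1 << 2),
    RS_INVAL_IDLE_TIMEOUT = (1 << 3),
    RS_INVAL_XID_AGE = (1 << 4),
} ReplicationSlotInvalidationCause;

/* Causes checked at checkpoint/restartpoint in v4 (per the xlog.c hunks). */
uint32_t
checkpoint_causes_v4(void)
{
    return RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT;
}

/* Hypothetical variant: also ask for the XID-age check. */
uint32_t
checkpoint_causes_with_xid_age(void)
{
    return checkpoint_causes_v4() | RS_INVAL_XID_AGE;
}

/* Same test DetermineSlotInvalidationCause() applies per cause. */
bool
cause_requested(uint32_t possible_causes,
                ReplicationSlotInvalidationCause cause)
{
    assert(cause != RS_INVAL_NONE);
    return (possible_causes & (uint32_t) cause) != 0;
}
```

A caller adding the extra bit would also have to supply a valid nextXID instead of InvalidTransactionId, since IsReplicationSlotXIDAged() returns false for a non-normal nextXID.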
* Re: Introduce XID age based replication slot invalidation @ 2026-03-26 10:42 SATYANARAYANA NARLAPURAM <[email protected]> parent: Bharath Rupireddy <[email protected]> 2 siblings, 0 replies; 31+ messages in thread From: SATYANARAYANA NARLAPURAM @ 2026-03-26 10:42 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

Hi,

On Wed, Mar 25, 2026 at 12:17 PM Bharath Rupireddy <[email protected]> wrote:

> Hi,
>
> On Tue, Mar 24, 2026 at 11:50 PM Masahiko Sawada <[email protected]> wrote:
> >
> > > Please find the v3 patch for further review.
> >
> > Thank you for updating the patch. I think the patch is reasonably
> > simple and can avoid unnecessary overheads well due to XID-based
> > checks. Here are some comments:
>
> Thank you for reviewing the patch.
>
> > vacuum_get_cutoffs() is also called by VACUUM FULL, CLUSTER, and
> > REPACK. I'm not sure that users would expect the slot invalidation
> > also in these commands. I think it's better to leave
> > vacuum_get_cutoffs() a pure cutoff computation function and we can try
> > to invalidate slots in heap_vacuum_rel(). It requires additional
> > ReadNextTransactionId() but we can live with it, or we can make
> > vacuum_get_cutoffs() return the nextXID as well (stored in *cutoffs).
>
> +1. I chose to perform the slot invalidation in heap_vacuum_rel by
> getting the next txn ID and calling vacuum_get_cutoffs again when a
> slot gets invalidated. IMHO, this is simpler than adding a flag and
> doing the invalidation selectively in vacuum_get_cutoffs.
>
> >     if (TransactionIdPrecedes(oldestXmin, cutoffXID))
> > +   {
> > +       invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
> > +                                                        0,
> > +                                                        InvalidOid,
> > +                                                        InvalidTransactionId,
> > +                                                        nextXID);
> > +   }
> >
> > I think it's better to check the procArray->replication_slot_xmin and
> > procArray->replication_slot_catalog_xmin before iterating over each
> > slot. Otherwise, we would end up checking every slot even when a long
> > running transaction holds the oldestxmin back.
>
> +1. Changed.
>
> > +   if (!TransactionIdIsNormal(cutoffXID))
> > +       cutoffXID = FirstNormalTransactionId;
> >
> > These two code fragments have the same comment but do slightly
> > different things. I guess the latter is missing '-'?
>
> Fixed the typo.
>
> I fixed a test error being reported in CI.
>
> Please find the attached v4 patch for further review.
>

+   if (InvalidateXIDAgedReplicationSlots(vacrel->cutoffs.OldestXmin,
+                                         ReadNextTransactionId()))

Does this account for catalog_xmin when vacuuming data (non-catalog) tables?

Thanks,
Satya

^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Introduce XID age based replication slot invalidation @ 2026-03-26 21:49 Masahiko Sawada <[email protected]> parent: Bharath Rupireddy <[email protected]> 2 siblings, 1 reply; 31+ messages in thread From: Masahiko Sawada @ 2026-03-26 21:49 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers On Wed, Mar 25, 2026 at 12:17 PM Bharath Rupireddy <[email protected]> wrote: > > Hi, > > On Tue, Mar 24, 2026 at 11:50 PM Masahiko Sawada <[email protected]> wrote: > > > > > Please find the v3 patch for further review. > > > > Thank you for updating the patch. I think the patch is reasonably > > simple and can avoid unnecessary overheads well due to XID-based > > checks. Here are some comments: > > Thank you for reviewing the patch. > > > vacuum_get_cutoff() is also called by VACUUM FULL, CLUSTER, and > > REPACK. I'm not sure that users would expect the slot invalidation > > also in these commands. I think it's better to leave > > vacuum_get_cutoff() a pure cutoff computation function and we can try > > to invalidate slots in heap_vacuum_rel(). It requires additional > > ReadNextTransactionId() but we can live with it, or we can make > > vacuum_get_cutoffs() return the nextXID as well (stored in *cutoffs). > > +1. I chose to perform the slot invalidation in heap_vacuum_rel by > getting the next txn ID and calling vacuum_get_cutoffs again when a > slot gets invalidated. IMHO, this is simple than adding a flag and do > the invalidation selectively in vacuum_get_cutoffs. 
> > > if (TransactionIdPrecedes(oldestXmin, cutoffXID)) > > + { > > + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, > > + 0, > > + InvalidOid, > > + InvalidTransactionId, > > + nextXID); > > + } > > > > I think it's better to check the procArray->replication_slot_xmin and > > procArray->replication_slot_catalog_xmin before iterating over each > > slot. Otherwise, we would end up checking every slot even when a long > > running transaction holds the oldestxmin back. > > +1. Changed. > > > + if (!TransactionIdIsNormal(cutoffXID)) > > + cutoffXID = FirstNormalTransactionId; > > > > These codes have the same comment but are doing a slightly different > > thing. I guess the latter is missing '-'? > > Fixed the typo. > > I fixed a test error being reported in CI. > > Please find the attached v4 patch for further review. Thank you for updating the patch. I've reviewed the patch and have some review comments: + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's xmin %u at next transaction ID %u exceeds the age %d specified by \"%s\"."), + xmin, + nextXID, + max_slot_xid_age, + "max_slot_xid_age"); I think it's better to show the age of the slot's xmin instead of the recent XID. --- + + if (!TransactionIdIsNormal(oldestXmin) || !TransactionIdIsNormal(nextXID)) + return false; + Do we expect that the passed oldestXmin or nextXID could be non-normal XIDs? I think the function assumes these are valid XIDs. Also, since this function is called only by heap_vacuum_rel(), we can call ReadNextTransactionId() within this function. --- + if (IsReplicationSlotXIDAged(slot_xmin, slot_catalog_xmin, nextXID)) We compute the cutoff XID in IsReplicationSlotXIDAged() again, which seems redundant. I've attached the fixup patch addressing these comments and having some code cleanups. Please review it. I'm reviewing the regression test part, and will share review comments soon. 
> > I've also attached the 0002 patch that adds a test case to demo a > production-like scenario by pushing the database to XID wraparound > limits and checking if the XID-age based invalidation with the GUC > setting at the default vacuum_failsafe_age of 1.6B works correctly, > and whether autovacuum can successfully remove this replication slot > blocker to proceed with freezing and bring the database back to > normal. I don't intend to get this committed unless others think > otherwise, but I wanted to have this as a reference. Thank you for sharing the test script! I'll check it as well. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com Attachments: [text/x-patch] v4_cleanup_masahiko.patch (13.7K, 2-v4_cleanup_masahiko.patch) download | inline diff: diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 809061110c7..bc9360b9743 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -805,16 +805,15 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, * vacuum's ability to freeze and remove dead tuples. Since OldestXmin * already covers the slot xmin/catalog_xmin values, pass it as a * preliminary check to avoid additional iteration over all the slots. - * - * If at least one slot was invalidated, recompute cutoffs so that this - * vacuum benefits from the advanced horizon immediately. - * - * XXX: Next XID could be returned as output from vacuum_get_cutoffs() but - * for now we live with an additional ReadNextTransactionId() call. */ - if (InvalidateXIDAgedReplicationSlots(vacrel->cutoffs.OldestXmin, - ReadNextTransactionId())) + if (MaybeInvalidateXIDAgedSlots(vacrel->cutoffs.OldestXmin)) + { + /* + * Some slots have been invalidated based on their XID age; recompute + * the vacuum cutoffs. 
+ */ vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + } vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 8ca5d86fe0e..8296856eefe 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -160,7 +160,7 @@ int max_replication_slots = 10; /* the maximum number of replication int idle_replication_slot_timeout_secs = 0; /* - * Invalidate replication slots that have xmin or catalog_xmin greater + * Invalidate replication slots that have xmin or catalog_xmin older * than the specified age; '0' disables it. */ int max_slot_xid_age = 0; @@ -183,8 +183,6 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr; static void ReplicationSlotShmemExit(int code, Datum arg); static bool IsSlotForConflictCheck(const char *name); static void ReplicationSlotDropPtr(ReplicationSlot *slot); -static bool IsReplicationSlotXIDAged(TransactionId xmin, TransactionId catalog_xmin, - TransactionId nextXID); /* internal persistency functions */ static void RestoreSlotFromDisk(const char *name); @@ -1792,7 +1790,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, long slot_idle_seconds, TransactionId xmin, TransactionId catalog_xmin, - TransactionId nextXID) + TransactionId recentXid) { StringInfoData err_detail; StringInfoData err_hint; @@ -1845,20 +1843,14 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, if (TransactionIdIsValid(xmin)) { /* translator: %s is a GUC variable name */ - appendStringInfo(&err_detail, _("The slot's xmin %u at next transaction ID %u exceeds the age %d specified by \"%s\"."), - xmin, - nextXID, - max_slot_xid_age, - "max_slot_xid_age"); + appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + xmin, (int32) (recentXid - xmin), "max_slot_xid_age", max_slot_xid_age); 
} else { /* translator: %s is a GUC variable name */ - appendStringInfo(&err_detail, _("The slot's catalog xmin %u at next transaction ID %u exceeds the age %d specified by \"%s\"."), - catalog_xmin, - nextXID, - max_slot_xid_age, - "max_slot_xid_age"); + appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age); } /* translator: %s is a GUC variable name */ @@ -1905,6 +1897,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. + */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. 
@@ -1916,7 +1927,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, - TransactionId nextXID, + TransactionId recentXid, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1989,9 +2000,23 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ - if ((possible_causes & RS_INVAL_XID_AGE) && - IsReplicationSlotXIDAged(s->data.xmin, s->data.catalog_xmin, nextXID)) - return RS_INVAL_XID_AGE; + if (possible_causes & RS_INVAL_XID_AGE) + { + Assert(TransactionIdIsValid(recentXid)); + + if (CanInvalidateXidAgedSlot(s)) + { + TransactionId xidLimit; + + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + } return RS_INVAL_NONE; } @@ -2015,7 +2040,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, - TransactionId nextXID, + TransactionId recentXid, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2068,7 +2093,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - nextXID, + recentXid, &inactive_since, now); @@ -2163,7 +2188,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, slot_idle_secs, s->data.xmin, - s->data.catalog_xmin, nextXID); + s->data.catalog_xmin, recentXid); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2217,7 +2242,7 @@ InvalidatePossiblyObsoleteSlot(uint32 
possible_causes, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, slot_idle_secs, s->data.xmin, - s->data.catalog_xmin, nextXID); + s->data.catalog_xmin, recentXid); /* done with this slot for now */ break; @@ -2259,7 +2284,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon, TransactionId nextXID) + TransactionId snapshotConflictHorizon, + TransactionId recentXid) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2298,7 +2324,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - nextXID, &released_lock)) + recentXid, &released_lock)) { Assert(released_lock); @@ -3330,104 +3356,53 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } -/* - * Check if the passed-in xmin or catalog_xmin have aged beyond the - * max_slot_xid_age GUC limit relative to nextXID. - * - * Returns true if either value exceeds the configured age. - */ -static bool -IsReplicationSlotXIDAged(TransactionId xmin, TransactionId catalog_xmin, - TransactionId nextXID) -{ - TransactionId cutoffXID; - bool aged = false; - - if (max_slot_xid_age == 0) - return false; - - if (!TransactionIdIsNormal(nextXID)) - return false; - - cutoffXID = nextXID - max_slot_xid_age; - - /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */ - /* this can cause the limit to go backwards by 3, but that's OK */ - if (cutoffXID < FirstNormalTransactionId) - cutoffXID -= FirstNormalTransactionId; - - if (TransactionIdIsNormal(xmin) && - TransactionIdPrecedes(xmin, cutoffXID)) - aged = true; - - if (TransactionIdIsNormal(catalog_xmin) && - TransactionIdPrecedes(catalog_xmin, cutoffXID)) - aged = true; - - return aged; -} - /* * Invalidate replication slots whose XID age exceeds the max_slot_xid_age * GUC. 
* - * The caller supplies oldestXmin, either computed via - * GetOldestNonRemovableTransactionId during vacuum, or computed via the - * minimum of slot xmin values obtained from ProcArrayGetReplicationSlotXmin, - * and nextXID, the next XID to be assigned used to compute the age. - * - * Preliminary checks based on the passed-in oldestXmin and the oldest slot - * xmin and catalog_xmin are done to avoid unnecessarily iterating over all - * the slots. If the oldestXmin age does not exceed the GUC then no - * individual slot can either, so the per-slot scan is skipped. For example, - * if oldestXmin is 100 and the GUC is 500, every slot's xmin must be >= 100, - * so none can be older than the GUC. Similarly, if the oldest slot xmin and - * catalog_xmin from ProcArray are not aged, the per-slot scan is skipped; - * this can happen when a long-running transaction holds the oldestXmin back. - * - * Even if the caller passes an oldestXmin that does not include the slot - * xmin/catalog_xmin range, there is no risk of incorrect invalidation: each - * slot's own xmin and catalog_xmin are individually verified against the GUC - * inside IsReplicationSlotXIDAged(). The only downside is an additional - * iteration over all the slots. + * The oldestXmin is expected to be a XID computed via + * GetOldestNonRemovableTransactionId() during vacuum. It is used as cutoffs + * for individual slot checks; if its age does not exceed the max_slot_xid_age, + * no individual slot can either, so we skip per-slot invalidation check. * * Returns true if at least one slot was invalidated. 
*/ bool -InvalidateXIDAgedReplicationSlots(TransactionId oldestXmin, TransactionId nextXID) +MaybeInvalidateXIDAgedSlots(TransactionId oldestXmin) { - TransactionId cutoffXID; + TransactionId recentXid; + TransactionId xidLimit; + TransactionId slot_xmin = InvalidTransactionId; + TransactionId slot_catalog_xmin = InvalidTransactionId; bool invalidated = false; - if (max_slot_xid_age == 0) - return false; + Assert(TransactionIdIsValid(oldestXmin)); - if (!TransactionIdIsNormal(oldestXmin) || !TransactionIdIsNormal(nextXID)) + if (max_slot_xid_age == 0) return false; - cutoffXID = nextXID - max_slot_xid_age; + recentXid = ReadNextTransactionId(); + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); - /* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */ - /* this can cause the limit to go backwards by 3, but that's OK */ - if (!TransactionIdIsNormal(cutoffXID)) - cutoffXID -= FirstNormalTransactionId; - - if (TransactionIdPrecedes(oldestXmin, cutoffXID)) - { - TransactionId slot_xmin; - TransactionId slot_catalog_xmin; + /* oldestXmin is not behind the cutoff; no need to check slots */ + if (TransactionIdPrecedes(xidLimit, oldestXmin)) + return false; - ProcArrayGetReplicationSlotXmin(&slot_xmin, &slot_catalog_xmin); + ProcArrayGetReplicationSlotXmin(&slot_xmin, &slot_catalog_xmin); - if (IsReplicationSlotXIDAged(slot_xmin, slot_catalog_xmin, nextXID)) - { - invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, - 0, - InvalidOid, - InvalidTransactionId, - nextXID); - } - } + /* + * Invalidate possibly obsolete slots based on XID-age, if either slot's + * xmin or catalog_xmin is older than the cutoff. 
+ */ + if ((TransactionIdIsValid(slot_xmin) && + TransactionIdPrecedes(slot_xmin, xidLimit)) || + (TransactionIdIsValid(slot_catalog_xmin) && + TransactionIdPrecedes(slot_catalog_xmin, xidLimit))) + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, + InvalidOid, + InvalidTransactionId, + recentXid); return invalidated; } diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 83cf8438724..c483f0f531b 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -371,7 +371,8 @@ extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, TransactionId snapshotConflictHorizon, - TransactionId nextXID); + TransactionId recentXid); +extern bool MaybeInvalidateXIDAgedSlots(TransactionId oldestXmin); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); @@ -391,6 +392,4 @@ extern bool SlotExistsInSyncStandbySlots(const char *slot_name); extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel); extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn); -extern bool InvalidateXIDAgedReplicationSlots(TransactionId oldestXmin, TransactionId nextXID); - #endif /* SLOT_H */ ^ permalink raw reply [nested|flat] 31+ messages in thread
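The cutoff arithmetic that the fixup relies on (retreat from the next XID by max_slot_xid_age, then compare each slot horizon against that limit) can be sketched in standalone C. This is only an illustration under stated assumptions: the typedef, macros, and the helpers xid_precedes(), xid_retreated_by(), and slot_is_xid_aged() are hypothetical stand-ins written for this note, not the PostgreSQL definitions, and the sketch ignores locking, synced slots, and everything else the real invalidation path handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not the PostgreSQL headers. */
typedef uint32_t TransactionId;

#define InvalidTransactionId      ((TransactionId) 0)
#define FirstNormalTransactionId  ((TransactionId) 3)

/* Modulo-2^32 comparison for normal XIDs: true if a is logically older than b. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Step back from xid by amount. If the subtraction lands on one of the
 * special XIDs (0..2), keep stepping back until a normal XID is reached.
 */
static TransactionId
xid_retreated_by(TransactionId xid, uint32_t amount)
{
    xid -= amount;
    while (xid < FirstNormalTransactionId)
        xid--;
    return xid;
}

/*
 * A slot counts as aged if either of its valid horizons is older than
 * recent_xid - max_age. max_age == 0 disables the check.
 */
static bool
slot_is_xid_aged(TransactionId xmin, TransactionId catalog_xmin,
                 TransactionId recent_xid, int max_age)
{
    TransactionId limit;

    assert(recent_xid >= FirstNormalTransactionId);

    if (max_age == 0)
        return false;

    limit = xid_retreated_by(recent_xid, (uint32_t) max_age);

    return (xmin != InvalidTransactionId && xid_precedes(xmin, limit)) ||
        (catalog_xmin != InvalidTransactionId &&
         xid_precedes(catalog_xmin, limit));
}
```

With recent_xid = 1000 and max_age = 500 the limit is 500, so a slot with xmin 400 is aged and one with xmin 600 is not. The same comparison keeps working when the subtraction wraps below zero, which is why the limit is computed once and compared with the modulo-2^32 test rather than with a plain "<".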
* Re: Introduce XID age based replication slot invalidation @ 2026-03-28 18:03 Bharath Rupireddy <[email protected]> parent: Masahiko Sawada <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Bharath Rupireddy @ 2026-03-28 18:03 UTC (permalink / raw) To: Masahiko Sawada <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Thu, Mar 26, 2026 at 2:50 PM Masahiko Sawada <[email protected]> wrote: > > Thank you for updating the patch. I've reviewed the patch and have > some review comments: Thank you for reviewing the patch. > + /* translator: %s is a GUC variable name */ > + appendStringInfo(&err_detail, _("The slot's xmin > %u at next transaction ID %u exceeds the age %d specified by > \"%s\"."), > + xmin, > + nextXID, > + max_slot_xid_age, > + "max_slot_xid_age"); > > I think it's better to show the age of the slot's xmin instead of the > recent XID. Agreed. > --- > + > + if (!TransactionIdIsNormal(oldestXmin) || !TransactionIdIsNormal(nextXID)) > + return false; > + > > Do we expect that the passed oldestXmin or nextXID could be non-normal > XIDs? I think the function assumes these are valid XIDs. The oldestXmin is now removed. Please see the responses at the end. > Also, since this function is called only by heap_vacuum_rel(), we can > call ReadNextTransactionId() within this function. Agreed. > --- > + if (IsReplicationSlotXIDAged(slot_xmin, slot_catalog_xmin, nextXID)) > > We compute the cutoff XID in IsReplicationSlotXIDAged() again, which > seems redundant. > > I've attached the fixup patch addressing these comments and having > some code cleanups. Please review it. The fixup patch looked good to me, I had that merged in the attached v5 patch. > I'm reviewing the regression test part, and will share review comments soon. 
> > > I've also attached the 0002 patch that adds a test case to demo a > > production-like scenario by pushing the database to XID wraparound > > limits and checking if the XID-age based invalidation with the GUC > > setting at the default vacuum_failsafe_age of 1.6B works correctly, > > and whether autovacuum can successfully remove this replication slot > > blocker to proceed with freezing and bring the database back to > > normal. I don't intend to get this committed unless others think > > otherwise, but I wanted to have this as a reference. > > Thank you for sharing the test script! I'll check it as well. Thank you. On Thu, Mar 26, 2026 at 3:42 AM SATYANARAYANA NARLAPURAM <[email protected]> wrote: > > Hi, > > + if (InvalidateXIDAgedReplicationSlots(vacrel->cutoffs.OldestXmin, > + ReadNextTransactionId())) > > Does this account catalog xmin for data tables? Nice catch! When vacuum runs on regular tables, it doesn't cover catalog_xmin in the OldestXmin. So if catalog_xmin is blocking relfrozenxid advancement, slot invalidation doesn't happen. I updated vacuum_get_cutoffs to return slot_catalog_xmin and slot_xmin. These values are already available in ComputeXidHorizons, so this doesn't require an additional proc-array lock. I also added support for XID age based slot invalidation during checkpoints. This helps standbys that can have replication slots but where vacuum doesn't run. (It skips synced slots, just like idle_replication_slot_timeout does.) Please find the attached v5 patches for further review. Thank you! 
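The catalog_xmin gap described above can be illustrated with a simplified, standalone C model. It assumes, as stated in this message, that the vacuum horizon for a regular table folds in the slots' xmin but not their catalog_xmin. The ToyHorizons struct and the toy_oldest_xmin(), precheck_from_oldest_xmin(), and precheck_from_slot_horizons() helpers are hypothetical names made up for this sketch; they are not the ComputeXidHorizons() code or any function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not the PostgreSQL headers. */
typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)

/* Modulo-2^32 comparison: true if a is logically older than b. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Return the older of two XIDs, ignoring invalid ones. */
static TransactionId
xid_older(TransactionId a, TransactionId b)
{
    if (a == InvalidTransactionId)
        return b;
    if (b == InvalidTransactionId)
        return a;
    return xid_precedes(a, b) ? a : b;
}

/* Toy model of the inputs that feed a vacuum horizon. */
typedef struct ToyHorizons
{
    TransactionId oldest_running;     /* oldest XID held by running backends */
    TransactionId slot_xmin;          /* oldest xmin across slots */
    TransactionId slot_catalog_xmin;  /* oldest catalog_xmin across slots */
} ToyHorizons;

/* Data tables see slot_xmin only; catalog tables also see catalog_xmin. */
static TransactionId
toy_oldest_xmin(const ToyHorizons *h, bool is_catalog_rel)
{
    TransactionId result;

    assert(h->oldest_running != InvalidTransactionId);

    result = xid_older(h->oldest_running, h->slot_xmin);
    if (is_catalog_rel)
        result = xid_older(result, h->slot_catalog_xmin);
    return result;
}

/* v4-style pre-check: look only at the relation's OldestXmin. */
static bool
precheck_from_oldest_xmin(TransactionId oldest_xmin, TransactionId xid_limit)
{
    return xid_precedes(oldest_xmin, xid_limit);
}

/* v5-style pre-check: look at both slot horizons directly. */
static bool
precheck_from_slot_horizons(TransactionId slot_xmin,
                            TransactionId slot_catalog_xmin,
                            TransactionId xid_limit)
{
    return (slot_xmin != InvalidTransactionId &&
            xid_precedes(slot_xmin, xid_limit)) ||
        (slot_catalog_xmin != InvalidTransactionId &&
         xid_precedes(slot_catalog_xmin, xid_limit));
}
```

As an example, take oldest_running = 1000, no slot xmin, slot_catalog_xmin = 100, and an XID limit of 600. For a data table the modeled OldestXmin is 1000, so the OldestXmin-based pre-check never fires even though the logical slot is far behind; checking the slot horizons directly does fire. That is the reason for passing slot_xmin and slot_catalog_xmin out of the horizon computation instead of relying on OldestXmin alone.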
-- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com Attachments: [application/x-patch] v5-0001-Add-XID-age-based-replication-slot-invalidation.patch (30.8K, 2-v5-0001-Add-XID-age-based-replication-slot-invalidation.patch) download | inline diff: From 4089dafb79c4b06a375d8bb3d32323d2e4640b04 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Sat, 28 Mar 2026 06:52:43 +0000 Subject: [PATCH v5 1/2] Add XID age based replication slot invalidation Introduce max_slot_xid_age, a GUC that invalidates replication slots whose xmin or catalog_xmin exceeds the specified age. Disabled by default. Idle or forgotten replication slots can hold back vacuum, leading to bloat and eventually XID wraparound. In the worst case this requires dropping the slot and single-user mode vacuuming. This setting avoids that by proactively invalidating slots that have fallen too far behind. Invalidation checks are performed once per relation during vacuum (both vacuum command and autovacuum), and also by the checkpointer during checkpoints and restartpoints. 
--- doc/src/sgml/config.sgml | 40 ++++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 20 +- src/backend/access/transam/xlog.c | 20 +- src/backend/commands/cluster.c | 2 +- src/backend/commands/vacuum.c | 8 +- src/backend/replication/slot.c | 128 ++++++++++++- src/backend/storage/ipc/procarray.c | 19 ++ src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/include/commands/vacuum.h | 4 +- src/include/replication/slot.h | 10 +- src/include/storage/procarray.h | 3 + src/test/recovery/t/019_replslot_limit.pl | 175 ++++++++++++++++++ 15 files changed, 435 insertions(+), 15 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 229f41353eb..46aac59cb20 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. This invalidation check happens + during vacuum (both <command>VACUUM</command> command and autovacuum) + and during checkpoints. + A value of zero (which is default) disables this feature. Users can set + this value anywhere from zero to two billion. This parameter can only be + set in the <filename>postgresql.conf</filename> file or on the server + command line. 
+ </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + + <para> + Note that this invalidation mechanism is not applicable for slots + on the standby server that are being synced from the primary server + (i.e., standby slots having + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> + value <literal>true</literal>). Synced slots are always considered to + be inactive because they don't perform logical decoding to produce + changes. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. 
+ </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index f698c2d899b..2e8e61eb0e2 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -147,6 +147,7 @@ #include "pgstat.h" #include "portability/instr_time.h" #include "postmaster/autovacuum.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/freespace.h" #include "storage/latch.h" @@ -642,6 +643,8 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, ErrorContextCallback errcallback; char **indnames = NULL; Size dead_items_max_bytes = 0; + TransactionId slot_xmin = InvalidTransactionId; + TransactionId slot_catalog_xmin = InvalidTransactionId; verbose = (params.options & VACOPT_VERBOSE) != 0; instrument = (verbose || (AmAutoVacuumWorkerProcess() && @@ -798,7 +801,22 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, * want to teach lazy_scan_prune to recompute vistest from time to time, * to increase the number of dead tuples it can prune away.) */ - vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs, + &slot_xmin, &slot_catalog_xmin); + + /* + * Try to invalidate XID-aged replication slots. Use the slot xmin values + * obtained from the same horizons computation that produced OldestXmin, + * avoiding an extra ProcArrayLock acquisition. + */ + if (MaybeInvalidateXIDAgedSlots(slot_xmin, slot_catalog_xmin)) + { + /* Recompute cutoffs after slot invalidation. 
*/ + vacrel->aggressive = vacuum_get_cutoffs(rel, params, + &vacrel->cutoffs, + NULL, NULL); + } + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f5c9a34374d..cb943bdc2a9 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7019,6 +7019,7 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; + uint32 possibleInvalidationCauses; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7441,8 +7442,15 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | + RS_INVAL_XID_AGE; + + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, _logSegNo, InvalidOid, + InvalidTransactionId, + max_slot_xid_age > 0 ? + ReadNextTransactionId() : InvalidTransactionId)) { /* @@ -7724,6 +7732,7 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + uint32 possibleInvalidationCauses; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -7898,8 +7907,14 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | + RS_INVAL_XID_AGE; + + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, _logSegNo, InvalidOid, + InvalidTransactionId, + max_slot_xid_age > 0 ? 
+ ReadNextTransactionId() : InvalidTransactionId)) { /* @@ -8764,6 +8779,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 09066db0956..118d5d28c1e 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -927,7 +927,7 @@ copy_table_data(Relation NewHeap, Relation OldHeap, Relation OldIndex, bool verb * not to be aggressive about this. */ memset(&params, 0, sizeof(VacuumParams)); - vacuum_get_cutoffs(OldHeap, params, &cutoffs); + vacuum_get_cutoffs(OldHeap, params, &cutoffs, NULL, NULL); /* * FreezeXid will become the table's new relfrozenxid, and that mustn't go diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index bce3a2daa24..e85d53a0ecb 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1098,7 +1098,9 @@ get_all_vacuum_rels(MemoryContext vac_context, int options) */ bool vacuum_get_cutoffs(Relation rel, const VacuumParams params, - struct VacuumCutoffs *cutoffs) + struct VacuumCutoffs *cutoffs, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) { int freeze_min_age, multixact_freeze_min_age, @@ -1133,7 +1135,9 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams params, * that only one vacuum process can be working on a particular table at * any time, and that each vacuum is always an independent transaction. 
*/ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + cutoffs->OldestXmin = GetOldestNonRemovableTransactionIdExt(rel, + slot_xmin, + slot_catalog_xmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..286f0f46341 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin older + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. 
@@ -1780,7 +1787,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId recentXid) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1835,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + xmin, (int32) (recentXid - xmin), "max_slot_xid_age", max_slot_xid_age); + } + else + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age); + } + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1863,6 +1897,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. 
+ */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. @@ -1874,6 +1927,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId recentXid, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1999,22 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + TransactionId xidLimit; + + Assert(TransactionIdIsValid(recentXid)); + + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1967,6 +2037,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId recentXid, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2019,6 +2090,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + recentXid, &inactive_since, now); @@ -2112,7 +2184,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + 
slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, recentXid); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2238,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, recentXid); /* done with this slot for now */ break; @@ -2192,6 +2266,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2205,7 +2281,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, + TransactionId recentXid) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2321,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + recentXid, &released_lock)) { Assert(released_lock); @@ -3275,3 +3352,44 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Invalidate replication slots whose XID age exceeds the max_slot_xid_age + * GUC. + * + * The slot_xmin and slot_catalog_xmin are the replication slot xmin values + * obtained from the same ComputeXidHorizons() call that computed OldestXmin + * during vacuum. Using these avoids a separate ProcArrayLock acquisition. + * + * Returns true if at least one slot was invalidated. 
+ */ +bool +MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin) +{ + TransactionId recentXid; + TransactionId xidLimit; + bool invalidated = false; + + if (max_slot_xid_age == 0) + return false; + + recentXid = ReadNextTransactionId(); + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + + /* + * Invalidate possibly obsolete slots based on XID-age, if either slot's + * xmin or catalog_xmin is older than the cutoff. + */ + if ((TransactionIdIsValid(slot_xmin) && + TransactionIdPrecedes(slot_xmin, xidLimit)) || + (TransactionIdIsValid(slot_catalog_xmin) && + TransactionIdPrecedes(slot_catalog_xmin, xidLimit))) + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, + InvalidOid, + InvalidTransactionId, + recentXid); + + return invalidated; +} diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index cc207cb56e3..18683ce5aea 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1950,11 +1950,30 @@ GlobalVisHorizonKindForRel(Relation rel) */ TransactionId GetOldestNonRemovableTransactionId(Relation rel) +{ + return GetOldestNonRemovableTransactionIdExt(rel, NULL, NULL); +} + +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. 
+ */ +TransactionId +GetOldestNonRemovableTransactionIdExt(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) { ComputeXidHorizonsResult horizons; ComputeXidHorizons(&horizons); + if (slot_xmin) + *slot_xmin = horizons.slot_xmin; + if (slot_catalog_xmin) + *slot_catalog_xmin = horizons.slot_catalog_xmin; + switch (GlobalVisHorizonKindForRel(rel)) { case VISHORIZON_SHARED: diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index de9092fdf5b..d60f39ec08e 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 0a862693fcd..ca3cc8417da 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2089,6 +2089,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index cf15597385b..055eba56bdf 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 1f45bca015c..c5ae9efe977 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -384,7 +384,9 @@ extern void vac_update_relstats(Relation relation, bool *minmulti_updated, bool in_outer_xact); extern bool vacuum_get_cutoffs(Relation rel, const VacuumParams params, - struct VacuumCutoffs *cutoffs); + struct VacuumCutoffs *cutoffs, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern bool vacuum_xid_failsafe_check(const struct VacuumCutoffs *cutoffs); extern void vac_update_datfrozenxid(void); extern void vacuum_delay_point(bool is_analyze); diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..0baa7112559 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is 
running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,10 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId recentXid); +extern bool MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index abdf021e66e..c198fd22515 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -53,6 +53,9 @@ extern RunningTransactions GetRunningTransactionData(void); extern bool TransactionIdIsInProgress(TransactionId xid); extern TransactionId GetOldestNonRemovableTransactionId(Relation rel); +extern TransactionId GetOldestNonRemovableTransactionIdExt(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern TransactionId GetOldestTransactionIdConsideredRunning(void); extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs); diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..d0f58d8317f 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,179 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Advance XIDs, run 
VACUUM, and wait for a slot to be invalidated due to XID age. +sub invalidate_slot_by_xid_age +{ + my ($node, $table_name, $slot_name, $slot_type, $nxids, $trigger) = @_; + + # Do some work to advance xids + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 1..$nxids loop + -- use an exception block so that each iteration eats an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); + + if ($trigger eq 'checkpoint') + { + $node->safe_psql('postgres', "CHECKPOINT"); + } + else + { + $node->safe_psql('postgres', "VACUUM"); + } + + # Wait for the replication slot to be invalidated due to XID age. + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) + or die + "Timed out while waiting for slot $slot_name to be invalidated"; + + ok(1, "$slot_type replication slot invalidated due to XID age (via $trigger)"); +} + +# ============================================================================= +# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Configure primary with XID age settings. Set autovacuum_naptime high so +# that the checkpointer (not vacuum) triggers the invalidation. 
+my $max_slot_xid_age = 500; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = '1h' +}); + +$primary5->start; + +# Take a backup for creating standby +$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create a standby linking to the primary using the replication slot +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); + +# Enable HS feedback. The slot should gain an xmin. We set the status interval +# so we'll see the results promptly. +$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); + +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); + +# Wait until standby has replayed enough data +$primary5->wait_for_catchup($standby5); + +# Wait for the slot to get xmin from hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby to make the replication slot on primary inactive. +# The slot's xmin persists and holds back datfrozenxid. +$standby5->stop; + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's xmin age comfortably +# exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'tab_int5', 'sb5_slot', 'physical', + 2 * $max_slot_xid_age, 'checkpoint'); + +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC (via checkpoint). 
+# ============================================================================= + +# ============================================================================= +# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC (via vacuum). + +# Reset autovacuum_naptime so that VACUUM-triggered invalidation works normally +$primary5->safe_psql('postgres', + "ALTER SYSTEM RESET autovacuum_naptime; SELECT pg_reload_conf();"); + +# Create a subscriber node +my $subscriber5 = PostgreSQL::Test::Cluster->new('subscriber5'); +$subscriber5->init(allows_streaming => 'logical'); +$subscriber5->start; + +# Create tables on both primary and subscriber +$primary5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); +$subscriber5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); + +# Insert some initial data +$primary5->safe_psql('postgres', + "INSERT INTO test_tbl5 VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $primary5_connstr = $primary5->connstr . 
' dbname=postgres'; +$primary5->safe_psql('postgres', + "CREATE PUBLICATION pub5 FOR TABLE test_tbl5"); + +$subscriber5->safe_psql('postgres', + "CREATE SUBSCRIPTION sub5 CONNECTION '$primary5_connstr' PUBLICATION pub5 WITH (slot_name = 'lsub5_slot')" +); + +# Wait for initial sync to complete +$subscriber5->wait_for_subscription_sync($primary5, 'sub5'); + +$result = $subscriber5->safe_psql('postgres', "SELECT count(*) FROM test_tbl5"); +is($result, qq(5), "check initial copy was done for logical replication (XID age test)"); + +# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin) +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin to advance"; + +# Stop subscriber to make the replication slot on primary inactive +$subscriber5->stop; + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's catalog_xmin age +# comfortably exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'test_tbl5', 'lsub5_slot', 'logical', + 2 * $max_slot_xid_age, 'vacuum'); + +$primary5->stop; + +# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. +# ============================================================================= + done_testing(); -- 2.47.3 [application/x-patch] v5-0002-Add-more-tests-for-XID-age-slot-invalidation.patch (7.3K, 3-v5-0002-Add-more-tests-for-XID-age-slot-invalidation.patch) download | inline diff: From 0a67e1d75683d3007880b34ad8e02730a3075740 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Sat, 28 Mar 2026 08:49:45 +0000 Subject: [PATCH v5 2/2] Add more tests for XID age slot invalidation Consume XIDs up to wraparound WARNING limits with max_slot_xid_age matching vacuum_failsafe_age (1.6B). 
Verify that autovacuum invalidates the inactive replication slot (XID-age-based invalidation), unblocks datfrozenxid advancement, and prevents wraparound without any intervention. --- src/test/recovery/Makefile | 3 +- src/test/recovery/t/019_replslot_limit.pl | 162 ++++++++++++++++++++++ 2 files changed, 164 insertions(+), 1 deletion(-) diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile index d41aaaf8ae1..5c3d2c89941 100644 --- a/src/test/recovery/Makefile +++ b/src/test/recovery/Makefile @@ -12,7 +12,8 @@ EXTRA_INSTALL=contrib/pg_prewarm \ contrib/pg_stat_statements \ contrib/test_decoding \ - src/test/modules/injection_points + src/test/modules/injection_points \ + src/test/modules/xid_wraparound subdir = src/test/recovery top_builddir = ../../.. diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index d0f58d8317f..3a1ada0a02e 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -715,4 +715,166 @@ $primary5->stop; # GUC. # ============================================================================= +# ============================================================================= +# Testcase: XID-age-based slot invalidation in a production-like scenario. +# Standby sets slot xmin via HS feedback, disconnects, XIDs consumed. +# Autovacuum automatically invalidates the slot once its xmin age exceeds +# max_slot_xid_age, advances datfrozenxid in all databases, and keeps the +# system healthy — no manual VACUUM, vacuumdb, or downtime needed. + +# Check if autovacuum has invalidated the slot due to xid_aged. +# Returns 1 if invalidated, 0 otherwise. Early exit when max_slot_xid_age = 0. 
+sub check_slot_invalidated +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + return 0 if $max_age == 0; + + my $reason = $node->safe_psql('postgres', + "SELECT invalidation_reason FROM pg_replication_slots WHERE slot_name = '$slot_name'"); + if ($reason eq 'xid_aged') + { + diag "Slot invalidated by autovacuum after consuming $consumed_xids XIDs"; + return 1; + } + return 0; +} + +# Verify server log shows slot invalidation by autovacuum worker with +# correct xmin, age, and next txid values. +sub verify_slot_xid_aged_invalidation +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids) = @_; + + my $log = slurp_file($node->logfile); + + # Verify the invalidation was performed by an autovacuum worker. + like($log, + qr/autovacuum worker\[\d+\] LOG:\s+invalidating obsolete replication slot "$slot_name"/, + "server log: $slot_name invalidated by autovacuum worker"); + + # Verify DETAIL shows the correct xmin and max_slot_xid_age. + like($log, + qr/autovacuum worker\[\d+\] DETAIL:\s+The slot's xmin $slot_xmin is (\d+) transactions old, which exceeds the configured "max_slot_xid_age" value of $max_age\./, + "server log: DETAIL shows xmin $slot_xmin and age $max_age"); + + # Extract xid age from the log and report for diagnostics. + $log =~ + /The slot's xmin $slot_xmin is (\d+) transactions old/; + my $log_xid_age = $1 // 'N/A'; + diag "xid_age from server log=$log_xid_age, max_slot_xid_age=$max_age, consumed=$consumed_xids XIDs"; +} + +# Verify slot was invalidated and wait for autovacuum to advance datfrozenxid +# in all databases. Early exit when max_slot_xid_age = 0. 
+sub verify_invalidation_and_recovery +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids, $slot_gone) = @_; + + return if $max_age == 0; + + ok($slot_gone, 'autovacuum invalidated slot due to xid_aged'); + + verify_slot_xid_aged_invalidation($node, $slot_name, + $slot_xmin, $max_age, $consumed_xids); + + # Wait for autovacuum to advance datfrozenxid in all databases past the + # wraparound danger zone — no manual intervention required. + $node->poll_query_until( + 'postgres', qq[ + SELECT NOT EXISTS ( + SELECT 1 FROM pg_database + WHERE age(datfrozenxid) > 2000000000 + ); + ]) or die "Timed out waiting for autovacuum to advance datfrozenxid in all databases"; +} + +my $primary6 = PostgreSQL::Test::Cluster->new('primary6'); +$primary6->init(allows_streaming => 'logical'); + +$max_slot_xid_age = 1600000000; # matches vacuum_failsafe_age default +$primary6->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = 1s +}); + +$primary6->start; +$primary6->safe_psql('postgres', "CREATE EXTENSION xid_wraparound"); + +$backup_name = 'backup6'; +$primary6->backup($backup_name); + +my $standby6 = PostgreSQL::Test::Cluster->new('standby6'); +$standby6->init_from_backup($primary6, $backup_name, has_streaming => 1); +$standby6->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb6_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary6->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb6_slot', true)"); + +$standby6->start; + +$primary6->safe_psql('postgres', + "CREATE TABLE tab_int6 AS SELECT generate_series(1,10) AS a"); +$primary6->wait_for_catchup($standby6); + +$primary6->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL FROM pg_replication_slots + WHERE slot_name = 'sb6_slot'; +]) or die "Timed out waiting for sb6_slot xmin from HS feedback"; + +$result = $primary6->safe_psql('postgres', + "SELECT xmin IS NOT NULL FROM pg_replication_slots WHERE 
slot_name = 'sb6_slot'"); +is($result, 't', 'slot has xmin from hot_standby_feedback'); + +# Capture the slot's xmin for later log verification. +my $slot_xmin = $primary6->safe_psql('postgres', + "SELECT xmin FROM pg_replication_slots WHERE slot_name = 'sb6_slot'"); + +# Stop standby; slot xmin persists and holds back datfrozenxid. +$standby6->stop; + +# Consume XIDs in 50M chunks. Once we exceed max_slot_xid_age, autovacuum +# (naptime=1s) should automatically invalidate the slot. Keep consuming +# until we see that happen — no manual VACUUM or downtime needed. +my $logstart6 = -s $primary6->logfile; +my $chunk = 50_000_000; +my $max_xids = 2_200_000_000; +my $consumed = 0; +my $slot_gone = 0; + +while ($consumed < $max_xids) +{ + $primary6->safe_psql('postgres', "SELECT consume_xids($chunk)"); + $consumed += $chunk; + my $remaining = $max_xids - $consumed; + diag "Consumed $consumed / $max_xids XIDs ($remaining remaining)"; + + if (!$slot_gone && check_slot_invalidated($primary6, 'sb6_slot', + $max_slot_xid_age, $consumed)) + { + $slot_gone = 1; + } +} + +verify_invalidation_and_recovery($primary6, 'sb6_slot', + $slot_xmin, $max_slot_xid_age, $consumed, $slot_gone); + +# Consume 1B more XIDs — combining with the 2.2B consumed above, the total +# of 3.2B exceeds the 2^31 (~2.1B) usable XID space (xidStopLimit), i.e. +# more than one full wraparound cycle, proving the system is healthy. +$primary6->safe_psql('postgres', "SELECT consume_xids(1000000000)"); +ok(1, 'writes succeed after autovacuum invalidated the slot'); + +$primary6->stop; + +# Testcase end: XID-age-based slot invalidation in a production-like scenario. +# ============================================================================= + done_testing(); -- 2.47.3 ^ permalink raw reply [nested|flat] 31+ messages in thread
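For readers following the cutoff test that the v5-0001 patch above applies in DetermineSlotInvalidationCause() and MaybeInvalidateXIDAgedSlots(), here is a minimal standalone sketch of the circular XID arithmetic it relies on. The names xid_t, xid_precedes(), xid_retreated_by() and slot_is_xid_aged() are hypothetical stand-ins for this note, not PostgreSQL APIs. They only mirror what TransactionIdPrecedes() and TransactionIdRetreatedBy() do for normal XIDs, and they assume a 32-bit XID space with the special XIDs 0..2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL's TransactionId definitions. */
typedef uint32_t xid_t;
#define INVALID_XID      ((xid_t) 0)
#define FIRST_NORMAL_XID ((xid_t) 3)

/* Circular "a is older than b" test for normal XIDs (modulo 2^32). */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Step back by "amount" XIDs, skipping the special XIDs 0..2 on wraparound. */
static xid_t
xid_retreated_by(xid_t xid, uint32_t amount)
{
    xid -= amount;
    while (xid < FIRST_NORMAL_XID)
        xid--;
    return xid;
}

/*
 * True if either valid horizon is older than (next_xid - max_age).
 * max_age == 0 disables the check, as with the proposed GUC.
 */
static bool
slot_is_xid_aged(xid_t xmin, xid_t catalog_xmin, xid_t next_xid, int max_age)
{
    xid_t limit;

    if (max_age == 0)
        return false;

    limit = xid_retreated_by(next_xid, (uint32_t) max_age);

    return (xmin != INVALID_XID && xid_precedes(xmin, limit)) ||
           (catalog_xmin != INVALID_XID && xid_precedes(catalog_xmin, limit));
}
```

With next_xid = 1000 and max_age = 500 the cutoff is 500, so a slot xmin of 400 counts as aged while 600 does not. The same comparison keeps working after the counter wraps, for example with next_xid = 100 and an xmin just below 2^32, because the subtraction is interpreted as a signed 32-bit distance.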
* Re: Introduce XID age based replication slot invalidation
@ 2026-03-29 20:16 Srinath Reddy Sadipiralla <[email protected]>
  parent: Bharath Rupireddy <[email protected]>
From: Srinath Reddy Sadipiralla @ 2026-03-29 20:16 UTC
To: Bharath Rupireddy <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

Hello,

Thanks for the v5 patch set. I have reviewed it and done some initial
testing, and it LGTM except for the following:

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 286f0f46341..c2ff7e464f0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1849,7 +1849,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
 				else
 				{
 					/* translator: %s is a GUC variable name */
-					appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."),
+					appendStringInfo(&err_detail, _("The slot's catalog_xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."),
 									 catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age);
 				}

While testing the active slot XID age invalidation (SIGTERM path), I
observed that the slot got invalidated and the walsender was killed by
SIGTERM. That starts an infinite retry cycle: the walreceiver starts a
new walsender, and the walsender tries to use the invalidated slot and
dies. I will think more on this.

--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/
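As an illustration of the wording issue raised in the review above, the sketch below picks the label that matches the horizon actually being reported. It is a standalone approximation under stated assumptions: format_xid_age_detail() and xid_t are hypothetical names for this note, plain snprintf() stands in for appendStringInfo(), and the label is passed through %s only to keep the example short. The patch itself keeps two complete translatable strings, one per branch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for PostgreSQL's TransactionId. */
typedef uint32_t xid_t;
#define INVALID_XID ((xid_t) 0)

/*
 * Write the DETAIL text for an XID-age invalidation into buf.
 * Reports xmin when it is valid, otherwise catalog_xmin, and names
 * the field that was actually used.  Returns snprintf()'s result,
 * or -1 if neither horizon is valid.
 */
static int
format_xid_age_detail(char *buf, size_t len,
                      xid_t xmin, xid_t catalog_xmin,
                      xid_t recent_xid, int max_age)
{
    const char *label;
    xid_t       horizon;

    if (xmin != INVALID_XID)
    {
        label = "xmin";
        horizon = xmin;
    }
    else if (catalog_xmin != INVALID_XID)
    {
        label = "catalog_xmin";
        horizon = catalog_xmin;
    }
    else
        return -1;

    /* The age is the signed 32-bit distance from the horizon to recent_xid. */
    return snprintf(buf, len,
                    "The slot's %s %u is %d transactions old, which exceeds "
                    "the configured \"%s\" value of %d.",
                    label, (unsigned) horizon,
                    (int) (int32_t) (recent_xid - horizon),
                    "max_slot_xid_age", max_age);
}
```

For a logical slot that has only catalog_xmin set, the resulting text names catalog_xmin rather than xmin, which is the behavior the one-line fix in the review is after.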
* Re: Introduce XID age based replication slot invalidation
@ 2026-03-30 01:35 Bharath Rupireddy <[email protected]>
  parent: Srinath Reddy Sadipiralla <[email protected]>
From: Bharath Rupireddy @ 2026-03-30 01:35 UTC
To: Srinath Reddy Sadipiralla <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

Hi,

On Sun, Mar 29, 2026 at 1:16 PM Srinath Reddy Sadipiralla <[email protected]> wrote:
>
> Hello,
>
> Thanks for the v5 patch set. I have reviewed it and done some initial
> testing, and it LGTM except for the following:

Thank you for reviewing and testing. I appreciate it.

> diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> index 286f0f46341..c2ff7e464f0 100644
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -1849,7 +1849,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
>  				else
>  				{
>  					/* translator: %s is a GUC variable name */
> -					appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."),
> +					appendStringInfo(&err_detail, _("The slot's catalog_xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."),
>  									 catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age);
>  				}

Fixed the typo.

> While testing the active slot XID age invalidation (SIGTERM path), I
> observed that the slot got invalidated and the walsender was killed by
> SIGTERM. That starts an infinite retry cycle: the walreceiver starts a
> new walsender, and the walsender tries to use the invalidated slot and
> dies. I will think more on this.

I would like to clarify that once a slot is invalidated due to any of
the reasons (ReplicationSlotInvalidationCause), it becomes unusable; the
sender will error out if the receiver tries to use it. This is
consistent with all existing slot invalidation mechanisms.

Please find the attached v6 patches fixing the typo for further review.

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Attachments:

[application/octet-stream] v6-0002-Add-more-tests-for-XID-age-slot-invalidation.patch (7.3K, 2-v6-0002-Add-more-tests-for-XID-age-slot-invalidation.patch)

From 569f8ac68e9ebf37149f82ff9bb91b178969e303 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Mon, 30 Mar 2026 01:15:48 +0000
Subject: [PATCH v6 2/2] Add more tests for XID age slot invalidation

Consume XIDs up to wraparound WARNING limits with max_slot_xid_age
matching vacuum_failsafe_age (1.6B). Verify that autovacuum
invalidates the inactive replication slot (XID-age-based
invalidation), unblocks datfrozenxid advancement, and prevents
wraparound without any intervention.
---
 src/test/recovery/Makefile                |   3 +-
 src/test/recovery/t/019_replslot_limit.pl | 162 ++++++++++++++++++++++
 2 files changed, 164 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile
index d41aaaf8ae1..5c3d2c89941 100644
--- a/src/test/recovery/Makefile
+++ b/src/test/recovery/Makefile
@@ -12,7 +12,8 @@
 EXTRA_INSTALL=contrib/pg_prewarm \
 	contrib/pg_stat_statements \
 	contrib/test_decoding \
-	src/test/modules/injection_points
+	src/test/modules/injection_points \
+	src/test/modules/xid_wraparound
 
 subdir = src/test/recovery
 top_builddir = ../../..
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index d0f58d8317f..3a1ada0a02e 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -715,4 +715,166 @@ $primary5->stop;
 # GUC.
# ============================================================================= +# ============================================================================= +# Testcase: XID-age-based slot invalidation in a production-like scenario. +# Standby sets slot xmin via HS feedback, disconnects, XIDs consumed. +# Autovacuum automatically invalidates the slot once its xmin age exceeds +# max_slot_xid_age, advances datfrozenxid in all databases, and keeps the +# system healthy — no manual VACUUM, vacuumdb, or downtime needed. + +# Check if autovacuum has invalidated the slot due to xid_aged. +# Returns 1 if invalidated, 0 otherwise. Early exit when max_slot_xid_age = 0. +sub check_slot_invalidated +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + return 0 if $max_age == 0; + + my $reason = $node->safe_psql('postgres', + "SELECT invalidation_reason FROM pg_replication_slots WHERE slot_name = '$slot_name'"); + if ($reason eq 'xid_aged') + { + diag "Slot invalidated by autovacuum after consuming $consumed_xids XIDs"; + return 1; + } + return 0; +} + +# Verify server log shows slot invalidation by autovacuum worker with +# correct xmin, age, and next txid values. +sub verify_slot_xid_aged_invalidation +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids) = @_; + + my $log = slurp_file($node->logfile); + + # Verify the invalidation was performed by an autovacuum worker. + like($log, + qr/autovacuum worker\[\d+\] LOG:\s+invalidating obsolete replication slot "$slot_name"/, + "server log: $slot_name invalidated by autovacuum worker"); + + # Verify DETAIL shows the correct xmin and max_slot_xid_age. + like($log, + qr/autovacuum worker\[\d+\] DETAIL:\s+The slot's xmin $slot_xmin is (\d+) transactions old, which exceeds the configured "max_slot_xid_age" value of $max_age\./, + "server log: DETAIL shows xmin $slot_xmin and age $max_age"); + + # Extract xid age from the log and report for diagnostics. 
+ $log =~ + /The slot's xmin $slot_xmin is (\d+) transactions old/; + my $log_xid_age = $1 // 'N/A'; + diag "xid_age from server log=$log_xid_age, max_slot_xid_age=$max_age, consumed=$consumed_xids XIDs"; +} + +# Verify slot was invalidated and wait for autovacuum to advance datfrozenxid +# in all databases. Early exit when max_slot_xid_age = 0. +sub verify_invalidation_and_recovery +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids, $slot_gone) = @_; + + return if $max_age == 0; + + ok($slot_gone, 'autovacuum invalidated slot due to xid_aged'); + + verify_slot_xid_aged_invalidation($node, $slot_name, + $slot_xmin, $max_age, $consumed_xids); + + # Wait for autovacuum to advance datfrozenxid in all databases past the + # wraparound danger zone — no manual intervention required. + $node->poll_query_until( + 'postgres', qq[ + SELECT NOT EXISTS ( + SELECT 1 FROM pg_database + WHERE age(datfrozenxid) > 2000000000 + ); + ]) or die "Timed out waiting for autovacuum to advance datfrozenxid in all databases"; +} + +my $primary6 = PostgreSQL::Test::Cluster->new('primary6'); +$primary6->init(allows_streaming => 'logical'); + +$max_slot_xid_age = 1600000000; # matches vacuum_failsafe_age default +$primary6->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = 1s +}); + +$primary6->start; +$primary6->safe_psql('postgres', "CREATE EXTENSION xid_wraparound"); + +$backup_name = 'backup6'; +$primary6->backup($backup_name); + +my $standby6 = PostgreSQL::Test::Cluster->new('standby6'); +$standby6->init_from_backup($primary6, $backup_name, has_streaming => 1); +$standby6->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb6_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary6->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb6_slot', true)"); + +$standby6->start; + +$primary6->safe_psql('postgres', + "CREATE TABLE tab_int6 AS SELECT generate_series(1,10) AS a"); 
+$primary6->wait_for_catchup($standby6); + +$primary6->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL FROM pg_replication_slots + WHERE slot_name = 'sb6_slot'; +]) or die "Timed out waiting for sb6_slot xmin from HS feedback"; + +$result = $primary6->safe_psql('postgres', + "SELECT xmin IS NOT NULL FROM pg_replication_slots WHERE slot_name = 'sb6_slot'"); +is($result, 't', 'slot has xmin from hot_standby_feedback'); + +# Capture the slot's xmin for later log verification. +my $slot_xmin = $primary6->safe_psql('postgres', + "SELECT xmin FROM pg_replication_slots WHERE slot_name = 'sb6_slot'"); + +# Stop standby; slot xmin persists and holds back datfrozenxid. +$standby6->stop; + +# Consume XIDs in 50M chunks. Once we exceed max_slot_xid_age, autovacuum +# (naptime=1s) should automatically invalidate the slot. Keep consuming +# until we see that happen — no manual VACUUM or downtime needed. +my $logstart6 = -s $primary6->logfile; +my $chunk = 50_000_000; +my $max_xids = 2_200_000_000; +my $consumed = 0; +my $slot_gone = 0; + +while ($consumed < $max_xids) +{ + $primary6->safe_psql('postgres', "SELECT consume_xids($chunk)"); + $consumed += $chunk; + my $remaining = $max_xids - $consumed; + diag "Consumed $consumed / $max_xids XIDs ($remaining remaining)"; + + if (!$slot_gone && check_slot_invalidated($primary6, 'sb6_slot', + $max_slot_xid_age, $consumed)) + { + $slot_gone = 1; + } +} + +verify_invalidation_and_recovery($primary6, 'sb6_slot', + $slot_xmin, $max_slot_xid_age, $consumed, $slot_gone); + +# Consume 1B more XIDs — combining with the 2.2B consumed above, the total +# of 3.2B exceeds the 2^31 (~2.1B) usable XID space (xidStopLimit), i.e. +# more than one full wraparound cycle, proving the system is healthy. +$primary6->safe_psql('postgres', "SELECT consume_xids(1000000000)"); +ok(1, 'writes succeed after autovacuum invalidated the slot'); + +$primary6->stop; + +# Testcase end: XID-age-based slot invalidation in a production-like scenario. 
+# ============================================================================= + done_testing(); -- 2.47.3 [application/octet-stream] v6-0001-Add-XID-age-based-replication-slot-invalidation.patch (30.9K, 3-v6-0001-Add-XID-age-based-replication-slot-invalidation.patch) download | inline diff: From 572dafbacfceeb7a334a860c9723958c4e2e0aaa Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Mon, 30 Mar 2026 01:12:56 +0000 Subject: [PATCH v6 1/2] Add XID age based replication slot invalidation Introduce max_slot_xid_age, a GUC that invalidates replication slots whose xmin or catalog_xmin exceeds the specified age. Disabled by default. Idle or forgotten replication slots can hold back vacuum, leading to bloat and eventually XID wraparound. In the worst case this requires dropping the slot and single-user mode vacuuming. This setting avoids that by proactively invalidating slots that have fallen too far behind. Invalidation checks are performed once per relation during vacuum (both vacuum command and autovacuum), and also by the checkpointer during checkpoints and restartpoints. 
--- doc/src/sgml/config.sgml | 40 ++++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 20 +- src/backend/access/transam/xlog.c | 20 +- src/backend/commands/cluster.c | 2 +- src/backend/commands/vacuum.c | 8 +- src/backend/replication/slot.c | 128 ++++++++++++- src/backend/storage/ipc/procarray.c | 19 ++ src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/include/commands/vacuum.h | 4 +- src/include/replication/slot.h | 10 +- src/include/storage/procarray.h | 3 + src/test/recovery/t/019_replslot_limit.pl | 175 ++++++++++++++++++ 15 files changed, 435 insertions(+), 15 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 229f41353eb..46aac59cb20 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. This invalidation check happens + during vacuum (both <command>VACUUM</command> command and autovacuum) + and during checkpoints. + A value of zero (which is default) disables this feature. Users can set + this value anywhere from zero to two billion. This parameter can only be + set in the <filename>postgresql.conf</filename> file or on the server + command line. 
+ </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + + <para> + Note that this invalidation mechanism is not applicable for slots + on the standby server that are being synced from the primary server + (i.e., standby slots having + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> + value <literal>true</literal>). Synced slots are always considered to + be inactive because they don't perform logical decoding to produce + changes. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. 
+ </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index f698c2d899b..2e8e61eb0e2 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -147,6 +147,7 @@ #include "pgstat.h" #include "portability/instr_time.h" #include "postmaster/autovacuum.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/freespace.h" #include "storage/latch.h" @@ -642,6 +643,8 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, ErrorContextCallback errcallback; char **indnames = NULL; Size dead_items_max_bytes = 0; + TransactionId slot_xmin = InvalidTransactionId; + TransactionId slot_catalog_xmin = InvalidTransactionId; verbose = (params.options & VACOPT_VERBOSE) != 0; instrument = (verbose || (AmAutoVacuumWorkerProcess() && @@ -798,7 +801,22 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, * want to teach lazy_scan_prune to recompute vistest from time to time, * to increase the number of dead tuples it can prune away.) */ - vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs, + &slot_xmin, &slot_catalog_xmin); + + /* + * Try to invalidate XID-aged replication slots. Use the slot xmin values + * obtained from the same horizons computation that produced OldestXmin, + * avoiding an extra ProcArrayLock acquisition. + */ + if (MaybeInvalidateXIDAgedSlots(slot_xmin, slot_catalog_xmin)) + { + /* Recompute cutoffs after slot invalidation. 
*/ + vacrel->aggressive = vacuum_get_cutoffs(rel, params, + &vacrel->cutoffs, + NULL, NULL); + } + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f5c9a34374d..cb943bdc2a9 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7019,6 +7019,7 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; + uint32 possibleInvalidationCauses; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7441,8 +7442,15 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | + RS_INVAL_XID_AGE; + + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, _logSegNo, InvalidOid, + InvalidTransactionId, + max_slot_xid_age > 0 ? + ReadNextTransactionId() : InvalidTransactionId)) { /* @@ -7724,6 +7732,7 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + uint32 possibleInvalidationCauses; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -7898,8 +7907,14 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | + RS_INVAL_XID_AGE; + + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, _logSegNo, InvalidOid, + InvalidTransactionId, + max_slot_xid_age > 0 ? 
+ ReadNextTransactionId() : InvalidTransactionId)) { /* @@ -8764,6 +8779,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 09066db0956..118d5d28c1e 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -927,7 +927,7 @@ copy_table_data(Relation NewHeap, Relation OldHeap, Relation OldIndex, bool verb * not to be aggressive about this. */ memset(&params, 0, sizeof(VacuumParams)); - vacuum_get_cutoffs(OldHeap, params, &cutoffs); + vacuum_get_cutoffs(OldHeap, params, &cutoffs, NULL, NULL); /* * FreezeXid will become the table's new relfrozenxid, and that mustn't go diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index bce3a2daa24..e85d53a0ecb 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1098,7 +1098,9 @@ get_all_vacuum_rels(MemoryContext vac_context, int options) */ bool vacuum_get_cutoffs(Relation rel, const VacuumParams params, - struct VacuumCutoffs *cutoffs) + struct VacuumCutoffs *cutoffs, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) { int freeze_min_age, multixact_freeze_min_age, @@ -1133,7 +1135,9 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams params, * that only one vacuum process can be working on a particular table at * any time, and that each vacuum is always an independent transaction.
*/ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + cutoffs->OldestXmin = GetOldestNonRemovableTransactionIdExt(rel, + slot_xmin, + slot_catalog_xmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..c2ff7e464f0 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin older + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. 
@@ -1780,7 +1787,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId recentXid) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1835,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + xmin, (int32) (recentXid - xmin), "max_slot_xid_age", max_slot_xid_age); + } + else + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's catalog_xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age); + } + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1863,6 +1897,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. 
+ */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. @@ -1874,6 +1927,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId recentXid, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1999,22 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + TransactionId xidLimit; + + Assert(TransactionIdIsValid(recentXid)); + + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1967,6 +2037,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId recentXid, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2019,6 +2090,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + recentXid, &inactive_since, now); @@ -2112,7 +2184,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + 
slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, recentXid); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2238,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, recentXid); /* done with this slot for now */ break; @@ -2192,6 +2266,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2205,7 +2281,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, + TransactionId recentXid) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2321,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + recentXid, &released_lock)) { Assert(released_lock); @@ -3275,3 +3352,44 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Invalidate replication slots whose XID age exceeds the max_slot_xid_age + * GUC. + * + * The slot_xmin and slot_catalog_xmin are the replication slot xmin values + * obtained from the same ComputeXidHorizons() call that computed OldestXmin + * during vacuum. Using these avoids a separate ProcArrayLock acquisition. + * + * Returns true if at least one slot was invalidated. 
+ */ +bool +MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin) +{ + TransactionId recentXid; + TransactionId xidLimit; + bool invalidated = false; + + if (max_slot_xid_age == 0) + return false; + + recentXid = ReadNextTransactionId(); + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + + /* + * Invalidate possibly obsolete slots based on XID-age, if either slot's + * xmin or catalog_xmin is older than the cutoff. + */ + if ((TransactionIdIsValid(slot_xmin) && + TransactionIdPrecedes(slot_xmin, xidLimit)) || + (TransactionIdIsValid(slot_catalog_xmin) && + TransactionIdPrecedes(slot_catalog_xmin, xidLimit))) + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, + InvalidOid, + InvalidTransactionId, + recentXid); + + return invalidated; +} diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index cc207cb56e3..18683ce5aea 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1950,11 +1950,30 @@ GlobalVisHorizonKindForRel(Relation rel) */ TransactionId GetOldestNonRemovableTransactionId(Relation rel) +{ + return GetOldestNonRemovableTransactionIdExt(rel, NULL, NULL); +} + +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. 
+ */ +TransactionId +GetOldestNonRemovableTransactionIdExt(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) { ComputeXidHorizonsResult horizons; ComputeXidHorizons(&horizons); + if (slot_xmin) + *slot_xmin = horizons.slot_xmin; + if (slot_catalog_xmin) + *slot_catalog_xmin = horizons.slot_catalog_xmin; + switch (GlobalVisHorizonKindForRel(rel)) { case VISHORIZON_SHARED: diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index de9092fdf5b..d60f39ec08e 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 0a862693fcd..ca3cc8417da 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2089,6 +2089,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index cf15597385b..055eba56bdf 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 1f45bca015c..c5ae9efe977 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -384,7 +384,9 @@ extern void vac_update_relstats(Relation relation, bool *minmulti_updated, bool in_outer_xact); extern bool vacuum_get_cutoffs(Relation rel, const VacuumParams params, - struct VacuumCutoffs *cutoffs); + struct VacuumCutoffs *cutoffs, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern bool vacuum_xid_failsafe_check(const struct VacuumCutoffs *cutoffs); extern void vac_update_datfrozenxid(void); extern void vacuum_delay_point(bool is_analyze); diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..0baa7112559 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is 
running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,10 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId recentXid); +extern bool MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index abdf021e66e..c198fd22515 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -53,6 +53,9 @@ extern RunningTransactions GetRunningTransactionData(void); extern bool TransactionIdIsInProgress(TransactionId xid); extern TransactionId GetOldestNonRemovableTransactionId(Relation rel); +extern TransactionId GetOldestNonRemovableTransactionIdExt(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern TransactionId GetOldestTransactionIdConsideredRunning(void); extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs); diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..d0f58d8317f 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,179 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Advance XIDs, run 
VACUUM, and wait for a slot to be invalidated due to XID age. +sub invalidate_slot_by_xid_age +{ + my ($node, $table_name, $slot_name, $slot_type, $nxids, $trigger) = @_; + + # Do some work to advance xids + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 1..$nxids loop + -- use an exception block so that each iteration eats an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); + + if ($trigger eq 'checkpoint') + { + $node->safe_psql('postgres', "CHECKPOINT"); + } + else + { + $node->safe_psql('postgres', "VACUUM"); + } + + # Wait for the replication slot to be invalidated due to XID age. + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) + or die + "Timed out while waiting for slot $slot_name to be invalidated"; + + ok(1, "$slot_type replication slot invalidated due to XID age (via $trigger)"); +} + +# ============================================================================= +# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC. + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Configure primary with XID age settings. Set autovacuum_naptime high so +# that the checkpointer (not vacuum) triggers the invalidation. 
+my $max_slot_xid_age = 500; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = '1h' +}); + +$primary5->start; + +# Take a backup for creating standby +$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create a standby linking to the primary using the replication slot +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); + +# Enable HS feedback. The slot should gain an xmin. We set the status interval +# so we'll see the results promptly. +$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); + +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); + +# Wait until standby has replayed enough data +$primary5->wait_for_catchup($standby5); + +# Wait for the slot to get xmin from hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby to make the replication slot on primary inactive. +# The slot's xmin persists and holds back datfrozenxid. +$standby5->stop; + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's xmin age comfortably +# exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'tab_int5', 'sb5_slot', 'physical', + 2 * $max_slot_xid_age, 'checkpoint'); + +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC (via checkpoint). 
+# ============================================================================= + +# ============================================================================= +# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC (via vacuum). + +# Reset autovacuum_naptime so that VACUUM-triggered invalidation works normally +$primary5->safe_psql('postgres', + "ALTER SYSTEM RESET autovacuum_naptime; SELECT pg_reload_conf();"); + +# Create a subscriber node +my $subscriber5 = PostgreSQL::Test::Cluster->new('subscriber5'); +$subscriber5->init(allows_streaming => 'logical'); +$subscriber5->start; + +# Create tables on both primary and subscriber +$primary5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); +$subscriber5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); + +# Insert some initial data +$primary5->safe_psql('postgres', + "INSERT INTO test_tbl5 VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $primary5_connstr = $primary5->connstr . 
' dbname=postgres'; +$primary5->safe_psql('postgres', + "CREATE PUBLICATION pub5 FOR TABLE test_tbl5"); + +$subscriber5->safe_psql('postgres', + "CREATE SUBSCRIPTION sub5 CONNECTION '$primary5_connstr' PUBLICATION pub5 WITH (slot_name = 'lsub5_slot')" +); + +# Wait for initial sync to complete +$subscriber5->wait_for_subscription_sync($primary5, 'sub5'); + +$result = $subscriber5->safe_psql('postgres', "SELECT count(*) FROM test_tbl5"); +is($result, qq(5), "check initial copy was done for logical replication (XID age test)"); + +# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin) +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin to advance"; + +# Stop subscriber to make the replication slot on primary inactive +$subscriber5->stop; + +# Advance XIDs and wait for the slot to be invalidated due to XID age. +# Use 2x the max_slot_xid_age to ensure the slot's catalog_xmin age +# comfortably exceeds the configured limit. +invalidate_slot_by_xid_age($primary5, 'test_tbl5', 'lsub5_slot', 'logical', + 2 * $max_slot_xid_age, 'vacuum'); + +$primary5->stop; + +# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age +# GUC. +# ============================================================================= + done_testing(); -- 2.47.3 ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Introduce XID age based replication slot invalidation @ 2026-03-31 00:13 Masahiko Sawada <[email protected]> parent: Bharath Rupireddy <[email protected]> 1 sibling, 1 reply; 31+ messages in thread From: Masahiko Sawada @ 2026-03-31 00:13 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers On Sun, Mar 29, 2026 at 6:35 PM Bharath Rupireddy <[email protected]> wrote: > > Hi, > > On Sun, Mar 29, 2026 at 1:16 PM Srinath Reddy Sadipiralla > <[email protected]> wrote: > > > > Hello, > > > > Thanks for the v5 patch set, I have reviewed and did initial testing on > > v5 patch set, and it LGTM, except these > > Thank you for reviewing and testing. I appreciate it. > > > diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c > > index 286f0f46341..c2ff7e464f0 100644 > > --- a/src/backend/replication/slot.c > > +++ b/src/backend/replication/slot.c > > @@ -1849,7 +1849,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, > > else > > { > > /* translator: %s is a GUC variable name */ > > - appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), > > + appendStringInfo(&err_detail, _("The slot's catalog_xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), > > catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age); > > } > > Fixed the typo. > > > while testing the active slot XID age invalidation (SIGTERM path) , i > > observed that slot got invalidated , walsender was killed because of > > SIGTERM , then starts the infinite-retry-cycle problem where > > walreceiver starts walsender and walsender will try to use an invalidated > > slot and dies, will think more on this. 
> > I would like to clarify that once a slot is invalidated due to any of > the reasons (ReplicationSlotInvalidationCause), it becomes unusable; > the sender will error out if the receiver tries to use it. This is > consistent with all existing slot invalidation mechanisms. > > Please find the attached v6 patches fixing the typo for further review. > I've reviewed the v6 patch. Here are some comments. bool vacuum_get_cutoffs(Relation rel, const VacuumParams params, - struct VacuumCutoffs *cutoffs) + struct VacuumCutoffs *cutoffs, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) How about storing both slot_xmin and catalog_xmin into VacuumCutoffs? --- - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | + RS_INVAL_XID_AGE; + + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, _logSegNo, InvalidOid, + InvalidTransactionId, + max_slot_xid_age > 0 ? + ReadNextTransactionId() : InvalidTransactionId)) It's odd to me that we specify RS_INVAL_XID_AGE while passing InvalidTransactionId. I think we can specify RS_INVAL_XID_AGE along with a valid recentXId only when we'd like to check the slots based on their XIDs. --- + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + TransactionId xidLimit; + + Assert(TransactionIdIsValid(recentXid)); + + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + I think we can avoid calculating xidLimit for every slot by calculating it in InvalidatePossiblyObsoleteSlot() and passing it to DetermineSlotInvalidationCause(). 
--- */ TransactionId GetOldestNonRemovableTransactionId(Relation rel) +{ + return GetOldestNonRemovableTransactionIdExt(rel, NULL, NULL); +} + +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. + */ +TransactionId +GetOldestNonRemovableTransactionIdExt(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) { I understand that the primary reason why the patch introduces another variant of GetOldestNonRemovableTransactionId() is to avoid an extra ProcArrayLock acquisition to get the replication slot xmin and catalog_xmin. While it's not very elegant, I think it is acceptable, because otherwise autovacuum would take an extra ProcArrayLock (in shared mode) for every table it vacuums. ProcArrayLock is already known to be a highly contended lock, so it would be better to avoid taking it once more. If others think differently, we can just call ProcArrayGetReplicationSlotXmin() separately and compare the results to the limit of XID-age based slot invalidation. Having said that, I personally don't want to add new instructions to the existing GetOldestNonRemovableTransactionId(). I guess we might want to make both the existing function and the new function call a common (inline) function that takes ComputeXidHorizonsResult and returns the appropriate transaction ID based on the given relation. --- + # Do some work to advance xids + $node->safe_psql( + 'postgres', qq[ + do \$\$ + begin + for i in 1..$nxids loop + -- use an exception block so that each iteration eats an XID + begin + insert into $table_name values (i); + exception + when division_by_zero then null; + end; + end loop; + end\$\$; + ]); I think it's faster to use pg_current_xact_id() instead. --- + else + { + $node->safe_psql('postgres', "VACUUM"); + } We don't need to vacuum all tables here. 
--- +# Configure primary with XID age settings. Set autovacuum_naptime high so +# that the checkpointer (not vacuum) triggers the invalidation. +my $max_slot_xid_age = 500; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = '1h' +}); I think it's better to disable autovacuum than to set a large autovacuum_naptime. --- +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age +# GUC (via checkpoint). I think we can say "physical slot" instead of "standby's slot" to avoid confusion; at first glance, I read "standby's slot" as a slot created on the standby. --- Do we have tests for invalidating slots on the standbys? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
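The review point about computing xidLimit once per pass, rather than once per slot, can be sketched in standalone C. This is only an illustration under stated assumptions, not the patch's code: xid_t, slot_t, xid_retreated_by(), xid_precedes() and count_xid_aged_slots() are hypothetical names written for this example. The two helpers are modelled on the behaviour of PostgreSQL's TransactionIdRetreatedBy() and TransactionIdPrecedes(), assuming 32-bit XIDs where 0 is invalid and 3 is the first normal XID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;             /* hypothetical stand-in for TransactionId */

#define INVALID_XID       ((xid_t) 0)
#define FIRST_NORMAL_XID  ((xid_t) 3)

/* Hypothetical, simplified slot: only the two horizons that matter here. */
typedef struct
{
    xid_t xmin;
    xid_t catalog_xmin;
} slot_t;

static bool
xid_is_valid(xid_t xid)
{
    return xid != INVALID_XID;
}

/* Step back by 'amount' XIDs, skipping the reserved values below 3. */
static xid_t
xid_retreated_by(xid_t xid, uint32_t amount)
{
    xid -= amount;                  /* unsigned arithmetic wraps modulo 2^32 */
    while (xid < FIRST_NORMAL_XID)
        xid--;
    return xid;
}

/* Wraparound-aware "a is older than b"; both must be normal XIDs. */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Count the slots whose xmin or catalog_xmin is older than the age limit.
 * The limit is derived once, before the loop, instead of once per slot.
 */
size_t
count_xid_aged_slots(const slot_t *slots, size_t nslots,
                     xid_t recent_xid, uint32_t max_age)
{
    xid_t  limit;
    size_t naged = 0;

    assert(max_age > 0 && recent_xid >= FIRST_NORMAL_XID);

    limit = xid_retreated_by(recent_xid, max_age);

    for (size_t i = 0; i < nslots; i++)
    {
        const slot_t *s = &slots[i];

        if ((xid_is_valid(s->xmin) && xid_precedes(s->xmin, limit)) ||
            (xid_is_valid(s->catalog_xmin) &&
             xid_precedes(s->catalog_xmin, limit)))
            naged++;
    }

    return naged;
}
```

Hoisting the limit out of the loop does not change the result; it only avoids repeating the same subtraction for every slot, which is the shape the review asks for between InvalidatePossiblyObsoleteSlot() and DetermineSlotInvalidationCause().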
* RE: Introduce XID age based replication slot invalidation @ 2026-03-31 07:25 Hayato Kuroda (Fujitsu) <[email protected]> parent: Bharath Rupireddy <[email protected]> 1 sibling, 1 reply; 31+ messages in thread From: Hayato Kuroda (Fujitsu) @ 2026-03-31 07:25 UTC (permalink / raw) To: 'Bharath Rupireddy' <[email protected]>; Srinath Reddy Sadipiralla <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; John H <[email protected]>; pgsql-hackers Dear Bharath, Thanks for re-working the project. While reading the old discussion, I found that Robert Haas was against XID-based invalidation because it's difficult to determine the cutoff age [1]. Can you clarify your thoughts on this point? Are you focusing on solving the wraparound issue rather than the bloated-instance issue? The code may not be accepted unless we get his agreement. [1]: https://www.postgresql.org/message-id/[email protected]... Best regards, Hayato Kuroda FUJITSU LIMITED
* Re: Introduce XID age based replication slot invalidation @ 2026-03-31 16:45 Bharath Rupireddy <[email protected]> parent: Hayato Kuroda (Fujitsu) <[email protected]> 0 siblings, 0 replies; 31+ messages in thread From: Bharath Rupireddy @ 2026-03-31 16:45 UTC (permalink / raw) To: Hayato Kuroda (Fujitsu) <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; Masahiko Sawada <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Tue, Mar 31, 2026 at 12:25 AM Hayato Kuroda (Fujitsu) <[email protected]> wrote: > > Dear Bharath, > > Thanks for re-working the project. Thank you for looking into this. > While seeing the old discussion, I found that Robert Haas was against the XID-based > invalidation, because it's difficult to determine the cutoff age [1]. > Can you clarify your thought against the point? Are you focusing on solving the > wraparound issues, not for bloated instance issue? > The code may not be accepted unless we got his agreement. > > [1]: https://www.postgresql.org/message-id/[email protected]... I summarized what others (Nathan, Robert, Amit, Alvaro, Bertrand) said about it, together with my responses, here: https://www.postgresql.org/message-id/CALj2ACVY%2BFd5vC0VjW%3D5VDK9mmt-Y%2BPDZxnBp8ngGAZc24Vv9g%40ma.... Please have a look. A good choice for production is to set max_slot_xid_age to vacuum_failsafe_age (1.6 billion by default) or a little less, so that autovacuum invalidates the slot before entering failsafe mode, unblocking datfrozenxid advancement and avoiding XID wraparound without manual VACUUM or downtime. I added a test for this in the 0002 patch. Please have a look. -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com
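The arithmetic behind that recommendation can be made concrete with a small standalone C sketch. Everything here is an assumption for illustration: slot_xid_age(), suggested_max_slot_xid_age() and slot_is_xid_aged() are hypothetical helpers, not functions from the patch. The age formula mirrors the `(int32) (recentXid - xmin)` expression that the patch prints in ReportSlotInvalidation(), 1,600,000,000 is the default vacuum_failsafe_age, and the margin used for "a little less" is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default value of vacuum_failsafe_age (1.6 billion transactions). */
#define FAILSAFE_AGE_DEFAULT 1600000000

/* Age of a slot horizon in transactions, computed modulo 2^32. */
int32_t
slot_xid_age(uint32_t recent_xid, uint32_t slot_xmin)
{
    return (int32_t) (recent_xid - slot_xmin);
}

/*
 * Hypothetical helper: choose a limit 'margin' transactions below the
 * failsafe age, so slots are invalidated before failsafe mode starts.
 */
int32_t
suggested_max_slot_xid_age(int32_t failsafe_age, int32_t margin)
{
    assert(margin >= 0 && margin < failsafe_age);
    return failsafe_age - margin;
}

/*
 * True if the horizon is strictly older than the limit. A limit of 0
 * means the feature is disabled, as with the GUC in the patch.
 */
bool
slot_is_xid_aged(uint32_t recent_xid, uint32_t slot_xmin, int32_t max_age)
{
    if (max_age == 0)
        return false;
    return slot_xid_age(recent_xid, slot_xmin) > max_age;
}
```

With a margin of 100 million, the suggested limit is 1.5 billion, so a slot whose xmin is 1.7 billion transactions old would be flagged while a slot 500 transactions behind would not. The strict comparison matches the TransactionIdPrecedes() test in the patch, which fires only once the horizon is older than the computed limit.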
* Re: Introduce XID age based replication slot invalidation @ 2026-03-31 17:20 Bharath Rupireddy <[email protected]> parent: Masahiko Sawada <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Bharath Rupireddy @ 2026-03-31 17:20 UTC (permalink / raw) To: Masahiko Sawada <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Mon, Mar 30, 2026 at 5:13 PM Masahiko Sawada <[email protected]> wrote: > > I've reviewed the v6 patch. Here are some comments. Thank you for reviewing the patch. > bool > vacuum_get_cutoffs(Relation rel, const VacuumParams params, > - struct VacuumCutoffs *cutoffs) > + struct VacuumCutoffs *cutoffs, > + TransactionId *slot_xmin, > + TransactionId *slot_catalog_xmin) > > How about storing both slot_xmin and catalog_xmin into VacuumCutoffs? Done. > --- > - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | > RS_INVAL_IDLE_TIMEOUT, > + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | > + RS_INVAL_XID_AGE; > + > + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, > _logSegNo, InvalidOid, > + InvalidTransactionId, > + max_slot_xid_age > 0 ? > + ReadNextTransactionId() : > InvalidTransactionId)) > > It's odd to me that we specify RS_INVAL_XID_AGE while passing > InvalidTransactionId. I think we can specify RS_INVAL_XID_AGE along > with a valid recentXId only when we'd like to check the slots based on > their XIDs. Done. 
> --- > + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ > + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) > + { > + TransactionId xidLimit; > + > + Assert(TransactionIdIsValid(recentXid)); > + > + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); > + > > I think we can avoid calculating xidLimit for every slot by > calculating it in InvalidatePossiblyObsoleteSlot() and passing it to > DetermineSlotInvalidationCause(). Done. > --- > */ > TransactionId > GetOldestNonRemovableTransactionId(Relation rel) > +{ > + return GetOldestNonRemovableTransactionIdExt(rel, NULL, NULL); > +} > + > +/* > + * Same as GetOldestNonRemovableTransactionId(), but also returns the > + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() > + * call. This avoids a separate ProcArrayLock acquisition when the caller > + * needs both values. > + */ > +TransactionId > +GetOldestNonRemovableTransactionIdExt(Relation rel, > + TransactionId *slot_xmin, > + TransactionId *slot_catalog_xmin) > { > > I understand that the primary reason why the patch introduces another > variant of GetOldestNonRemovableTransactionId() is to avoid extra > ProcArrayLock acquision to get replication slot xmin and catalog_xmin. > While it's not very elegant, I find that it would not be bad because > otherwise autovacuum takes extra ProcArrayLock (in shared mode) for > every table to vacuum. The ProcArrayLock is already known > high-contented lock it would be better to avoid taking it once more. > If others think differently, we can just call > ProcArrayGetReplicationSlotXmin() separately and compare them to the > limit of XID-age based slot invalidation. I understand the concerns around the ProcArrayLock and I think a new function to return the computed slot's xmin and catalog_xmin is good. > Having said that, I personally don't want to add new instructions to > the existing GetOldestNonRemovableTransactionId(). 
I guess we might > want to make both the existing function and new function call a common > (inline) function that takes ComputeXidHorizonsResult and returns > appropriate transaction id based on the given relation. Done. > --- > + # Do some work to advance xids > + $node->safe_psql( > + 'postgres', qq[ > + do \$\$ > + begin > + for i in 1..$nxids loop > + -- use an exception block so that each iteration eats an XID > + begin > + insert into $table_name values (i); > + exception > + when division_by_zero then null; > + end; > + end loop; > + end\$\$; > + ]); > > I think it's faster to use pg_current_xact_id() instead. Done. The original loop came from an existing test case in 001_stream_rep.pl; I switched to the pg_current_xact_id() approach. Test time stays the same, i.e. 9 wallclock secs. > --- > + else > + { > + $node->safe_psql('postgres', "VACUUM"); > + } > > We don't need to vacuum all tables here. Fixed. > --- > +# Configure primary with XID age settings. Set autovacuum_naptime high so > +# that the checkpointer (not vacuum) triggers the invalidation. > +my $max_slot_xid_age = 500; > +$primary5->append_conf( > + 'postgresql.conf', qq{ > +max_slot_xid_age = $max_slot_xid_age > +autovacuum_naptime = '1h' > +}); > > I think that it's better to disable autovacuum than setting a large number. Done. > --- > +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age > +# GUC (via checkpoint). > > I think that we can say "physical slot" instead of standby's slot to > avoid confusion as I thought standby's slot is a slot created on the > standby at the first glance. Fixed. > --- > Do we have tests for invalidating slots on the standbys? Added a test case for this. Please find the attached v7 patches for further review. Thank you! 
-- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com Attachments: [application/x-patch] v7-0001-Add-XID-age-based-replication-slot-invalidation.patch (31.7K, 2-v7-0001-Add-XID-age-based-replication-slot-invalidation.patch) download | inline diff: From af66f78a12d15ed26356eaead75dfac3ccd56819 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Tue, 31 Mar 2026 16:09:15 +0000 Subject: [PATCH v7 1/2] Add XID age based replication slot invalidation Introduce max_slot_xid_age, a GUC that invalidates replication slots whose xmin or catalog_xmin exceeds the specified age. Disabled by default. Idle or forgotten replication slots can hold back vacuum, leading to bloat and eventually XID wraparound. In the worst case this requires dropping the slot and single-user mode vacuuming. This setting avoids that by proactively invalidating slots that have fallen too far behind. Invalidation checks are performed once per relation during vacuum (both vacuum command and autovacuum), and also by the checkpointer during checkpoints and restartpoints. 
--- doc/src/sgml/config.sgml | 40 ++++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 15 ++ src/backend/access/transam/xlog.c | 34 +++- src/backend/commands/vacuum.c | 4 +- src/backend/replication/slot.c | 130 ++++++++++++- src/backend/storage/ipc/procarray.c | 60 ++++-- src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/include/commands/vacuum.h | 9 + src/include/replication/slot.h | 10 +- src/include/storage/procarray.h | 3 + src/test/recovery/t/019_replslot_limit.pl | 178 ++++++++++++++++++ 14 files changed, 478 insertions(+), 26 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 229f41353eb..46aac59cb20 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <literal>xmin</literal> (the oldest + transaction that this slot needs the database to retain) or + <literal>catalog_xmin</literal> (the oldest transaction affecting the + system catalogs that this slot needs the database to retain) has reached + the age specified by this setting. This invalidation check happens + during vacuum (both <command>VACUUM</command> command and autovacuum) + and during checkpoints. + A value of zero (which is default) disables this feature. Users can set + this value anywhere from zero to two billion. This parameter can only be + set in the <filename>postgresql.conf</filename> file or on the server + command line. 
+ </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + + <para> + Note that this invalidation mechanism is not applicable for slots + on the standby server that are being synced from the primary server + (i.e., standby slots having + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> + value <literal>true</literal>). Synced slots are always considered to + be inactive because they don't perform logical decoding to produce + changes. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. 
+ </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 24001b27387..98822823d2e 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -147,6 +147,7 @@ #include "pgstat.h" #include "portability/instr_time.h" #include "postmaster/autovacuum.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/freespace.h" #include "storage/latch.h" @@ -799,6 +800,20 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, * to increase the number of dead tuples it can prune away.) */ vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + + /* + * Try to invalidate XID-aged replication slots. Use the slot xmin values + * obtained from the same horizons computation that produced OldestXmin, + * avoiding an extra ProcArrayLock acquisition. + */ + if (MaybeInvalidateXIDAgedSlots(vacrel->cutoffs.slot_xmin, + vacrel->cutoffs.slot_catalog_xmin)) + { + /* Recompute cutoffs after slot invalidation */ + vacrel->aggressive = vacuum_get_cutoffs(rel, params, + &vacrel->cutoffs); + } + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2c1c6f88b74..70c1d5c5559 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7019,6 +7019,8 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; + uint32 possibleInvalidationCauses; + TransactionId recentXid; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7447,9 +7449,20 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + + possibleInvalidationCauses = 
RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + recentXid = InvalidTransactionId; + + if (max_slot_xid_age > 0) + { + possibleInvalidationCauses |= RS_INVAL_XID_AGE; + recentXid = ReadNextTransactionId(); + } + + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + recentXid)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -7730,6 +7743,8 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + uint32 possibleInvalidationCauses; + TransactionId recentXid; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -7904,9 +7919,19 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + recentXid = InvalidTransactionId; + + if (max_slot_xid_age > 0) + { + possibleInvalidationCauses |= RS_INVAL_XID_AGE; + recentXid = ReadNextTransactionId(); + } + + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + recentXid)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -8770,6 +8795,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 766a518c7a1..d78a05a4978 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1133,7 +1133,9 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams params, * that only one vacuum process can be working on a particular table at * any time, and that each vacuum is always an 
independent transaction. */ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + cutoffs->OldestXmin = GetOldestNonRemovableTransactionIdExt(rel, + &cutoffs->slot_xmin, + &cutoffs->slot_catalog_xmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..20729d2fb42 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin older + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. 
@@ -1780,7 +1787,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId recentXid) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1835,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); + + if (TransactionIdIsValid(xmin)) + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + xmin, (int32) (recentXid - xmin), "max_slot_xid_age", max_slot_xid_age); + } + else + { + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("The slot's catalog_xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), + catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age); + } + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1863,6 +1897,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. 
+ */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. @@ -1874,6 +1927,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1999,18 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + Assert(TransactionIdIsValid(xidLimit)); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1967,12 +2033,19 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId recentXid, bool *released_lock_out) { int last_signaled_pid = 0; bool released_lock = false; bool invalidated = false; TimestampTz inactive_since = 0; + TransactionId xidLimit = InvalidTransactionId; + + /* Compute the XID limit once, to avoid redundant work per slot */ + if ((possible_causes & RS_INVAL_XID_AGE) && + TransactionIdIsValid(recentXid)) + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); for (;;) { @@ -2019,6 +2092,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, 
snapshotConflictHorizon, + xidLimit, &inactive_since, now); @@ -2112,7 +2186,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, recentXid); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2240,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, recentXid); /* done with this slot for now */ break; @@ -2192,6 +2268,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2205,7 +2283,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, + TransactionId recentXid) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2323,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + recentXid, &released_lock)) { Assert(released_lock); @@ -3275,3 +3354,44 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Invalidate replication slots whose XID age exceeds the max_slot_xid_age + * GUC. 
+ * + * The slot_xmin and slot_catalog_xmin are the replication slot xmin values + * obtained from the same ComputeXidHorizons() call that computed OldestXmin + * during vacuum. Using these avoids a separate ProcArrayLock acquisition. + * + * Returns true if at least one slot was invalidated. + */ +bool +MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin) +{ + TransactionId recentXid; + TransactionId xidLimit; + bool invalidated = false; + + if (max_slot_xid_age == 0) + return false; + + recentXid = ReadNextTransactionId(); + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + + /* + * Invalidate possibly obsolete slots based on XID-age, if either slot's + * xmin or catalog_xmin is older than the cutoff. + */ + if ((TransactionIdIsValid(slot_xmin) && + TransactionIdPrecedes(slot_xmin, xidLimit)) || + (TransactionIdIsValid(slot_catalog_xmin) && + TransactionIdPrecedes(slot_catalog_xmin, xidLimit))) + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, + InvalidOid, + InvalidTransactionId, + recentXid); + + return invalidated; +} diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index cc207cb56e3..9e0acf7309d 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1937,6 +1937,30 @@ GlobalVisHorizonKindForRel(Relation rel) return VISHORIZON_TEMP; } +/* + * Helper to return the appropriate oldest non-removable TransactionId from + * pre-computed horizons, based on the relation type. 
+ */ +static inline TransactionId +GetOldestNonRemovableTransactionIdFromHorizons(ComputeXidHorizonsResult *horizons, + Relation rel) +{ + switch (GlobalVisHorizonKindForRel(rel)) + { + case VISHORIZON_SHARED: + return horizons->shared_oldest_nonremovable; + case VISHORIZON_CATALOG: + return horizons->catalog_oldest_nonremovable; + case VISHORIZON_DATA: + return horizons->data_oldest_nonremovable; + case VISHORIZON_TEMP: + return horizons->temp_oldest_nonremovable; + } + + /* just to prevent compiler warnings */ + return InvalidTransactionId; +} + /* * Return the oldest XID for which deleted tuples must be preserved in the * passed table. @@ -1955,20 +1979,30 @@ GetOldestNonRemovableTransactionId(Relation rel) ComputeXidHorizons(&horizons); - switch (GlobalVisHorizonKindForRel(rel)) - { - case VISHORIZON_SHARED: - return horizons.shared_oldest_nonremovable; - case VISHORIZON_CATALOG: - return horizons.catalog_oldest_nonremovable; - case VISHORIZON_DATA: - return horizons.data_oldest_nonremovable; - case VISHORIZON_TEMP: - return horizons.temp_oldest_nonremovable; - } + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); +} - /* just to prevent compiler warnings */ - return InvalidTransactionId; +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. 
+ */ +TransactionId +GetOldestNonRemovableTransactionIdExt(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) +{ + ComputeXidHorizonsResult horizons; + + ComputeXidHorizons(&horizons); + + if (slot_xmin) + *slot_xmin = horizons.slot_xmin; + if (slot_catalog_xmin) + *slot_catalog_xmin = horizons.slot_catalog_xmin; + + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); } /* diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index de9092fdf5b..d60f39ec08e 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 0a862693fcd..ca3cc8417da 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2089,6 +2089,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index cf15597385b..055eba56bdf 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 5d351f0df33..38e8e05567f 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -287,6 +287,15 @@ struct VacuumCutoffs */ TransactionId FreezeLimit; MultiXactId MultiXactCutoff; + + /* + * Replication slot xmin and catalog_xmin values obtained from the same + * ComputeXidHorizons() call that computed OldestXmin. These are used for + * XID-age-based replication slot invalidation without requiring an extra + * ProcArrayLock acquisition. 
+ */ + TransactionId slot_xmin; + TransactionId slot_catalog_xmin; }; /* diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..0baa7112559 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,10 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId recentXid); +extern bool MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index abdf021e66e..c198fd22515 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -53,6 +53,9 @@ extern RunningTransactions GetRunningTransactionData(void); extern bool TransactionIdIsInProgress(TransactionId xid); 
extern TransactionId GetOldestNonRemovableTransactionId(Relation rel); +extern TransactionId GetOldestNonRemovableTransactionIdExt(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern TransactionId GetOldestTransactionIdConsideredRunning(void); extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs); diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..e82bef93879 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,182 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Advance the given number of XIDs +sub advance_xids +{ + my ($node, $nxids) = @_; + my $sql = join(";\n", ("SELECT pg_current_xact_id()") x $nxids); + $node->safe_psql('postgres', $sql); +} + +# Wait for the given slot to be invalidated with reason 'xid_aged' +sub wait_for_xid_aged_invalidation +{ + my ($node, $slot_name) = @_; + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) or die "Timed out waiting for slot $slot_name to be invalidated"; +} + +# ===================================================================== +# Testcase start: Invalidate physical slot due to max_slot_xid_age GUC + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Disable autovacuum so checkpointer triggers the invalidation +my $max_slot_xid_age = 100; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); + +$primary5->start; + +# Take a backup for creating standby +$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create standby with HS feedback so the slot gains an xmin +my $standby5 = 
PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); +$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); +$primary5->wait_for_catchup($standby5); + +# Wait for the physical slot to get xmin via hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby so the slot becomes inactive with its xmin frozen +$standby5->stop; + +# Advance XIDs past 2x max_slot_xid_age so the slot's xmin is stale enough +advance_xids($primary5, 2 * $max_slot_xid_age); +$primary5->safe_psql('postgres', "CHECKPOINT"); +wait_for_xid_aged_invalidation($primary5, 'sb5_slot'); +ok(1, "physical slot invalidated due to XID age (via checkpoint)"); + +# Testcase end: Invalidate physical slot due to max_slot_xid_age GUC +# =================================================================== + +# ==================================================================== +# Testcase start: Invalidate logical slot due to max_slot_xid_age GUC + +# Re-enable autovacuum so that VACUUM-triggered invalidation works normally +$primary5->safe_psql('postgres', + "ALTER SYSTEM SET autovacuum = on; SELECT pg_reload_conf();"); + +# Create a subscriber node +my $subscriber5 = PostgreSQL::Test::Cluster->new('subscriber5'); +$subscriber5->init(allows_streaming => 'logical'); +$subscriber5->start; + +# Create tables on both primary and subscriber 
+$primary5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); +$subscriber5->safe_psql('postgres', "CREATE TABLE test_tbl5 (id int)"); +$primary5->safe_psql('postgres', + "INSERT INTO test_tbl5 VALUES (generate_series(1, 5));"); + +# Setup logical replication +my $primary5_connstr = $primary5->connstr . ' dbname=postgres'; +$primary5->safe_psql('postgres', + "CREATE PUBLICATION pub5 FOR TABLE test_tbl5"); +$subscriber5->safe_psql('postgres', + "CREATE SUBSCRIPTION sub5 CONNECTION '$primary5_connstr' PUBLICATION pub5 WITH (slot_name = 'lsub5_slot')" +); + +# Wait for initial sync +$subscriber5->wait_for_subscription_sync($primary5, 'sub5'); + +$result = $subscriber5->safe_psql('postgres', "SELECT count(*) FROM test_tbl5"); +is($result, qq(5), "check initial copy was done for logical replication (XID age test)"); + +# Wait for the logical slot to get catalog_xmin +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NULL AND catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin to advance"; + +# Stop subscriber to make the slot inactive +$subscriber5->stop; + +# Advance XIDs past 2x max_slot_xid_age so the slot's catalog_xmin is stale enough +advance_xids($primary5, 2 * $max_slot_xid_age); +$primary5->safe_psql('postgres', "VACUUM test_tbl5"); +wait_for_xid_aged_invalidation($primary5, 'lsub5_slot'); +ok(1, "logical slot invalidated due to XID age (via vacuum)"); + +# Testcase end: Invalidate logical slot due to max_slot_xid_age GUC +# ================================================================== + +# =============================================================================== +# Testcase start: Invalidate logical slot on standby due to max_slot_xid_age GUC + +# Disable max_slot_xid_age on primary and recreate the streaming slot +$primary5->safe_psql('postgres', + "ALTER SYSTEM SET max_slot_xid_age = 0; SELECT pg_reload_conf();"); 
+$primary5->safe_psql('postgres', + "SELECT pg_drop_replication_slot('sb5_slot')"); +$primary5->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb5_slot', true)"); +$standby5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); +$standby5->start; + +$primary5->wait_for_catchup($standby5); + +$standby5->create_logical_slot_on_standby($primary5, 'sb5_logical_slot', + 'postgres'); + +$standby5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_logical_slot'; +]) or die "Timed out waiting for sb5_logical_slot catalog_xmin"; + +# Advance XIDs on primary, replay on standby, then restartpoint to invalidate +advance_xids($primary5, 2 * $max_slot_xid_age); +$primary5->safe_psql('postgres', "CHECKPOINT"); +$primary5->wait_for_catchup($standby5); +$standby5->safe_psql('postgres', "CHECKPOINT"); + +wait_for_xid_aged_invalidation($standby5, 'sb5_logical_slot'); +ok(1, "logical (standby) slot invalidated due to XID age (via restartpoint)"); + +$standby5->stop; +$primary5->stop; + +# Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC +# ============================================================================= + done_testing(); -- 2.47.3 [application/x-patch] v7-0002-Add-more-tests-for-XID-age-slot-invalidation.patch (6.4K, 3-v7-0002-Add-more-tests-for-XID-age-slot-invalidation.patch) download | inline diff: From e7739c4b57c49c9dac14633831b11fa3c4f18181 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Tue, 31 Mar 2026 16:09:32 +0000 Subject: [PATCH v7 2/2] Add more tests for XID age slot invalidation Consume XIDs up to wraparound WARNING limits with max_slot_xid_age matching vacuum_failsafe_age (1.6B). 
Verify that autovacuum invalidates the inactive replication slot (XID-age-based invalidation), unblocks datfrozenxid advancement, and prevents wraparound without any intervention. --- src/test/recovery/Makefile | 3 +- src/test/recovery/t/019_replslot_limit.pl | 133 ++++++++++++++++++++++ 2 files changed, 135 insertions(+), 1 deletion(-) diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile index d41aaaf8ae1..5c3d2c89941 100644 --- a/src/test/recovery/Makefile +++ b/src/test/recovery/Makefile @@ -12,7 +12,8 @@ EXTRA_INSTALL=contrib/pg_prewarm \ contrib/pg_stat_statements \ contrib/test_decoding \ - src/test/modules/injection_points + src/test/modules/injection_points \ + src/test/modules/xid_wraparound subdir = src/test/recovery top_builddir = ../../.. diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index e82bef93879..63e87443edd 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -718,4 +718,137 @@ $primary5->stop; # Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC # ============================================================================= +# ================================================================================= +# Testcase start: XID-age-based slot invalidation with autovacuum (production-like) + +# Standby sets slot xmin via HS feedback, disconnects, XIDs are consumed. +# max_slot_xid_age is set to vacuum_failsafe_age (1.6B) so autovacuum +# invalidates the slot before entering failsafe mode, unblocking +# datfrozenxid advancement and avoiding XID wraparound without manual +# VACUUM or downtime. 
+ +# Verify server log shows slot invalidation by autovacuum worker +sub verify_slot_xid_aged_invalidation_in_server_log +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids) = @_; + + my $log = slurp_file($node->logfile); + + # Verify the invalidation was performed by an autovacuum worker + like($log, + qr/autovacuum worker\[\d+\] LOG:\s+invalidating obsolete replication slot "$slot_name"/, + "server log: $slot_name invalidated by autovacuum worker"); + + # Verify DETAIL shows the correct xmin and max_slot_xid_age + like($log, + qr/autovacuum worker\[\d+\] DETAIL:\s+The slot's xmin $slot_xmin is (\d+) transactions old, which exceeds the configured "max_slot_xid_age" value of $max_age\./, + "server log: DETAIL shows xmin $slot_xmin and age $max_age"); + + # Extract xid age from the log and report for diagnostics + $log =~ + /The slot's xmin $slot_xmin is (\d+) transactions old/; + my $log_xid_age = $1 // 'N/A'; + diag "xid_age from server log=$log_xid_age, max_slot_xid_age=$max_age, consumed=$consumed_xids XIDs"; +} + +# Verify slot invalidation and wait for autovacuum to advance datfrozenxid +sub verify_invalidation_and_recovery +{ + my ($node, $slot_name, $slot_xmin, $max_age, $consumed_xids) = @_; + + return if $max_age == 0; + + wait_for_xid_aged_invalidation($node, $slot_name); + ok(1, 'autovacuum invalidated slot due to xid_aged'); + + verify_slot_xid_aged_invalidation_in_server_log($node, $slot_name, + $slot_xmin, $max_age, $consumed_xids); + + # Wait for autovacuum to advance datfrozenxid in all databases past the + # wraparound threshold. 
+ $node->poll_query_until( + 'postgres', qq[ + SELECT NOT EXISTS ( + SELECT 1 FROM pg_database + WHERE age(datfrozenxid) > 2000000000 + ); + ]) or die "Timed out waiting for autovacuum to advance datfrozenxid in all databases"; +} + +my $primary6 = PostgreSQL::Test::Cluster->new('primary6'); +$primary6->init(allows_streaming => 'logical'); + +$max_slot_xid_age = 1600000000; # matches vacuum_failsafe_age default +$primary6->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = 1s +}); + +$primary6->start; +$primary6->safe_psql('postgres', "CREATE EXTENSION xid_wraparound"); + +$backup_name = 'backup6'; +$primary6->backup($backup_name); + +my $standby6 = PostgreSQL::Test::Cluster->new('standby6'); +$standby6->init_from_backup($primary6, $backup_name, has_streaming => 1); +$standby6->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb6_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary6->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb6_slot', true)"); + +$standby6->start; + +$primary6->safe_psql('postgres', + "CREATE TABLE tab_int6 AS SELECT generate_series(1,10) AS a"); +$primary6->wait_for_catchup($standby6); + +$primary6->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL FROM pg_replication_slots + WHERE slot_name = 'sb6_slot'; +]) or die "Timed out waiting for sb6_slot xmin from HS feedback"; + +# Capture the slot's xmin for later log verification +my $slot_xmin = $primary6->safe_psql('postgres', + "SELECT xmin FROM pg_replication_slots WHERE slot_name = 'sb6_slot'"); + +# Stop standby; slot xmin persists and holds back datfrozenxid +$standby6->stop; + +# Consume XIDs in 50M chunks; autovacuum (naptime=1s) will invalidate the +# slot once xmin age exceeds max_slot_xid_age. 
+my $logstart6 = -s $primary6->logfile; +my $chunk = 50_000_000; +my $max_xids = 2_200_000_000; +my $consumed = 0; + +while ($consumed < $max_xids) +{ + $primary6->safe_psql('postgres', "SELECT consume_xids($chunk)"); + $consumed += $chunk; + my $remaining = $max_xids - $consumed; + diag "consumed $consumed / $max_xids XIDs ($remaining remaining)"; +} + +verify_invalidation_and_recovery($primary6, 'sb6_slot', + $slot_xmin, $max_slot_xid_age, $consumed); + +# Consume 1B more XIDs — combining with the 2.2B consumed above, the total +# of 3.2B exceeds the 2^31 (~2.1B) usable XID space (xidStopLimit), i.e. +# more than one full wraparound cycle, proving the system is healthy. +$primary6->safe_psql('postgres', "SELECT consume_xids(1000000000)"); +ok(1, 'writes succeed after autovacuum invalidated the slot'); + +$primary6->stop; + +# Testcase end: XID-age-based slot invalidation with autovacuum (production-like) +# ================================================================================ + done_testing(); -- 2.47.3 ^ permalink raw reply [nested|flat] 31+ messages in thread
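For readers following the cutoff logic in MaybeInvalidateXIDAgedSlots() in the patch above, the wraparound-aware age check can be sketched in standalone C. This is a simplified illustration under stated assumptions, not code from the patch: the names Xid, xid_precedes(), xid_retreated_by() and slot_xid_too_old() are hypothetical stand-ins written to mirror what TransactionIdPrecedes() and TransactionIdRetreatedBy() are understood to do, and the functions in the PostgreSQL tree remain authoritative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TransactionId: a 32-bit counter that wraps. */
typedef uint32_t Xid;

#define INVALID_XID      ((Xid) 0)
#define FIRST_NORMAL_XID ((Xid) 3)	/* XIDs 0..2 are reserved */

static bool
xid_is_normal(Xid xid)
{
	return xid >= FIRST_NORMAL_XID;
}

/*
 * Modulo-2^32 comparison: "a precedes b" if a is less than 2^31 XIDs
 * behind b. Reserved XIDs are compared as plain unsigned values.
 */
static bool
xid_precedes(Xid a, Xid b)
{
	if (!xid_is_normal(a) || !xid_is_normal(b))
		return a < b;
	return (int32_t) (a - b) < 0;
}

/* Step back by "amount" XIDs, skipping the reserved values on wraparound. */
static Xid
xid_retreated_by(Xid xid, uint32_t amount)
{
	xid -= amount;
	while (xid < FIRST_NORMAL_XID)
		xid--;
	return xid;
}

/*
 * Would a slot holding slot_xid exceed max_age, given the next XID to be
 * assigned? A max_age of 0 disables the check, as with the proposed GUC.
 */
static bool
slot_xid_too_old(Xid slot_xid, Xid next_xid, uint32_t max_age)
{
	Xid			limit;

	if (max_age == 0 || slot_xid == INVALID_XID)
		return false;

	assert(xid_is_normal(next_xid));

	limit = xid_retreated_by(next_xid, max_age);
	return xid_precedes(slot_xid, limit);
}
```

With next_xid = 1200 and max_age = 100 the cutoff is 1100, so a slot xmin of 1000 would be flagged while 1150 would not. The same comparison keeps working when the cutoff wraps around below zero, which is why the limit is computed with modular arithmetic rather than a plain subtraction.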
* Re: Introduce XID age based replication slot invalidation @ 2026-04-01 19:38 Masahiko Sawada <[email protected]> parent: Bharath Rupireddy <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Masahiko Sawada @ 2026-04-01 19:38 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers On Tue, Mar 31, 2026 at 10:21 AM Bharath Rupireddy <[email protected]> wrote: > > Hi, > > On Mon, Mar 30, 2026 at 5:13 PM Masahiko Sawada <[email protected]> wrote: > > > > I've reviewed the v6 patch. Here are some comments. > > Thank you for reviewing the patch. > > > bool > > vacuum_get_cutoffs(Relation rel, const VacuumParams params, > > - struct VacuumCutoffs *cutoffs) > > + struct VacuumCutoffs *cutoffs, > > + TransactionId *slot_xmin, > > + TransactionId *slot_catalog_xmin) > > > > How about storing both slot_xmin and catalog_xmin into VacuumCutoffs? > > Done. > > > --- > > - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | > > RS_INVAL_IDLE_TIMEOUT, > > + possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | > > + RS_INVAL_XID_AGE; > > + > > + if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, > > _logSegNo, InvalidOid, > > + InvalidTransactionId, > > + max_slot_xid_age > 0 ? > > + ReadNextTransactionId() : > > InvalidTransactionId)) > > > > It's odd to me that we specify RS_INVAL_XID_AGE while passing > > InvalidTransactionId. I think we can specify RS_INVAL_XID_AGE along > > with a valid recentXId only when we'd like to check the slots based on > > their XIDs. > > Done. 
> > > --- > > + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ > > + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) > > + { > > + TransactionId xidLimit; > > + > > + Assert(TransactionIdIsValid(recentXid)); > > + > > + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); > > + > > > > I think we can avoid calculating xidLimit for every slot by > > calculating it in InvalidatePossiblyObsoleteSlot() and passing it to > > DetermineSlotInvalidationCause(). > > Done. > > > --- > > */ > > TransactionId > > GetOldestNonRemovableTransactionId(Relation rel) > > +{ > > + return GetOldestNonRemovableTransactionIdExt(rel, NULL, NULL); > > +} > > + > > +/* > > + * Same as GetOldestNonRemovableTransactionId(), but also returns the > > + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() > > + * call. This avoids a separate ProcArrayLock acquisition when the caller > > + * needs both values. > > + */ > > +TransactionId > > +GetOldestNonRemovableTransactionIdExt(Relation rel, > > + TransactionId *slot_xmin, > > + TransactionId *slot_catalog_xmin) > > { > > > > I understand that the primary reason why the patch introduces another > > variant of GetOldestNonRemovableTransactionId() is to avoid an extra > > ProcArrayLock acquisition to get replication slot xmin and catalog_xmin. > > While it's not very elegant, I find that it would not be bad because > > otherwise autovacuum takes an extra ProcArrayLock (in shared mode) for > > every table to vacuum. The ProcArrayLock is already known to be a > > highly contended lock, so it would be better to avoid taking it once more. > > If others think differently, we can just call > > ProcArrayGetReplicationSlotXmin() separately and compare them to the > > limit of XID-age based slot invalidation. > > I understand the concerns around the ProcArrayLock and I think a new > function to return the computed slot's xmin and catalog_xmin is good. 
> > > Having said that, I personally don't want to add new instructions to > > the existing GetOldestNonRemovableTransactionId(). I guess we might > > want to make both the existing function and new function call a common > > (inline) function that takes ComputeXidHorizonsResult and returns the > > appropriate transaction ID based on the given relation. > > Done. > > > --- > > + # Do some work to advance xids > > + $node->safe_psql( > > + 'postgres', qq[ > > + do \$\$ > > + begin > > + for i in 1..$nxids loop > > + -- use an exception block so that each iteration eats an XID > > + begin > > + insert into $table_name values (i); > > + exception > > + when division_by_zero then null; > > + end; > > + end loop; > > + end\$\$; > > + ]); > > > > I think it's faster to use pg_current_xact_id() instead. > > Done. I pulled this from an existing test case in 001_stream_rep.pl. > Used the pg_current_xact_id approach. Testing times stay the same i.e. > 9 wallclock secs. > > > --- > > + else > > + { > > + $node->safe_psql('postgres', "VACUUM"); > > + } > > > > We don't need to vacuum all tables here. > > Fixed. > > > --- > > +# Configure primary with XID age settings. Set autovacuum_naptime high so > > +# that the checkpointer (not vacuum) triggers the invalidation. > > +my $max_slot_xid_age = 500; > > +$primary5->append_conf( > > + 'postgresql.conf', qq{ > > +max_slot_xid_age = $max_slot_xid_age > > +autovacuum_naptime = '1h' > > +}); > > > > I think that it's better to disable autovacuum than setting a large number. > > Done. > > > --- > > +# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age > > +# GUC (via checkpoint). > > > > I think that we can say "physical slot" instead of standby's slot to > > avoid confusion as I thought standby's slot is a slot created on the > > standby at first glance. > > Fixed. > > > --- > > Do we have tests for invalidating slots on the standbys? > > Added a test case for this. 
> > Please find the attached v7 patches for further review. Thank you! I've reviewed the v7 patch and have some review comments: +# Advance the given number of XIDs +sub advance_xids +{ + my ($node, $nxids) = @_; + my $sql = join(";\n", ("SELECT pg_current_xact_id()") x $nxids); + $node->safe_psql('postgres', $sql); +} I think we can create a procedure on the primary5 instance to consume XIDs as follows: $primary5->safe_psql( 'postgres', qq{CREATE PROCEDURE consume_xid(cnt int) AS \$\$ DECLARE i int; BEGIN FOR i in 1..cnt LOOP EXECUTE 'SELECT pg_current_xact_id()'; COMMIT; END LOOP; END; \$\$ LANGUAGE plpgsql; }); --- +# Create a subscriber node +my $subscriber5 = PostgreSQL::Test::Cluster->new('subscriber5'); +$subscriber5->init(allows_streaming => 'logical'); +$subscriber5->start; Do we really need to create a subscriber for this test? I think we can simply create a logical slot on the primary5 and test the XID-age based slot invalidation. --- I've attached a fixup patch to propose some cleanup and refactoring, including: - changes to invalidation errdetail message. - passing xidLimit instead of recentXid to simplify the invalidation logic. - documentation changes. - comment changes. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com Attachments: [text/x-patch] fixup_for_v7_masahiko.patch (12.7K, 2-fixup_for_v7_masahiko.patch) download | inline diff: diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 46aac59cb20..9ad662b8b6f 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4772,17 +4772,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </term> <listitem> <para> - Invalidate replication slots whose <literal>xmin</literal> (the oldest - transaction that this slot needs the database to retain) or - <literal>catalog_xmin</literal> (the oldest transaction affecting the - system catalogs that this slot needs the database to retain) has reached - the age specified by this setting. 
This invalidation check happens - during vacuum (both <command>VACUUM</command> command and autovacuum) - and during checkpoints. - A value of zero (which is default) disables this feature. Users can set - this value anywhere from zero to two billion. This parameter can only be - set in the <filename>postgresql.conf</filename> file or on the server - command line. + Invalidate replication slots whose <structfield>xmin</structfield> age + or <structfield>catalog_xmin</structfield> age in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view has exceeded the age specified by this setting. Slot invalidation + due to this limit occurs during vacuum (both <command>VACUUM</command> + command and autovacuum) and during checkpoint. + A value of zero (the default) disables this feature. Users can set + this value anywhere from zero to two billion transactions. This parameter + can only be set in the <filename>postgresql.conf</filename> file or on + the server command line. + </para> + + <para> + The current age of a slot's <literal>xmin</literal> and + <literal>catalog_xmin</literal> can be monitored by applying the + <function>age</function> function to the corresponding columns in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view. 
</para> <para> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 70c1d5c5559..1aec1bcb79e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7019,8 +7019,8 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; - uint32 possibleInvalidationCauses; - TransactionId recentXid; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7450,19 +7450,19 @@ CreateCheckPoint(int flags) XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; - recentXid = InvalidTransactionId; - + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; if (max_slot_xid_age > 0) { - possibleInvalidationCauses |= RS_INVAL_XID_AGE; - recentXid = ReadNextTransactionId(); + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); } - if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, InvalidTransactionId, - recentXid)) + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -7743,8 +7743,8 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; - uint32 possibleInvalidationCauses; - TransactionId recentXid; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -7919,19 +7919,19 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - possibleInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; - recentXid = 
InvalidTransactionId; - + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; if (max_slot_xid_age > 0) { - possibleInvalidationCauses |= RS_INVAL_XID_AGE; - recentXid = ReadNextTransactionId(); + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); } - if (InvalidateObsoleteReplicationSlots(possibleInvalidationCauses, + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, InvalidTransactionId, - recentXid)) + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 7f36f795b72..cda86b9d50f 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1133,9 +1133,10 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params, * that only one vacuum process can be working on a particular table at * any time, and that each vacuum is always an independent transaction. 
*/ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionIdExt(rel, - &cutoffs->slot_xmin, - &cutoffs->slot_catalog_xmin); + cutoffs->OldestXmin = + GetOldestNonRemovableTransactionIdWithSlotXids(rel, + &cutoffs->slot_xmin, + &cutoffs->slot_catalog_xmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 20729d2fb42..e6271e2a519 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -1790,7 +1790,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, long slot_idle_seconds, TransactionId xmin, TransactionId catalog_xmin, - TransactionId recentXid) + TransactionId xidLimit) { StringInfoData err_detail; StringInfoData err_hint; @@ -1838,20 +1838,18 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, case RS_INVAL_XID_AGE: { - Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin)); - - if (TransactionIdIsValid(xmin)) - { - /* translator: %s is a GUC variable name */ - appendStringInfo(&err_detail, _("The slot's xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), - xmin, (int32) (recentXid - xmin), "max_slot_xid_age", max_slot_xid_age); - } - else - { - /* translator: %s is a GUC variable name */ - appendStringInfo(&err_detail, _("The slot's catalog_xmin %u is %d transactions old, which exceeds the configured \"%s\" value of %d."), - catalog_xmin, (int32) (recentXid - catalog_xmin), "max_slot_xid_age", max_slot_xid_age); - } + TransactionId slot_xid = TransactionIdIsValid(xmin) ? xmin : catalog_xmin; + int32 exceeded_by = (int32) (xidLimit - slot_xid); + int32 slot_age = (int32) max_slot_xid_age + exceeded_by; + + Assert(TransactionIdIsValid(slot_xid)); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, + TransactionIdIsValid(xmin) + ? 
_("The slot's xmin age of %d exceeds the configured \"%s\" of %d by %d transactions") + : _("The slot's catalog xmin age of %d exceeds the configured \"%s\" of %d by %d transactions"), + slot_age, "max_slot_xid_age", max_slot_xid_age, exceeded_by); /* translator: %s is a GUC variable name */ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), @@ -2033,19 +2031,13 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, - TransactionId recentXid, + TransactionId xidLimit, bool *released_lock_out) { int last_signaled_pid = 0; bool released_lock = false; bool invalidated = false; TimestampTz inactive_since = 0; - TransactionId xidLimit = InvalidTransactionId; - - /* Compute the XID limit once, to avoid redundant work per slot */ - if ((possible_causes & RS_INVAL_XID_AGE) && - TransactionIdIsValid(recentXid)) - xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); for (;;) { @@ -2187,7 +2179,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, slot_idle_secs, s->data.xmin, - s->data.catalog_xmin, recentXid); + s->data.catalog_xmin, xidLimit); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2241,7 +2233,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, slot_idle_secs, s->data.xmin, - s->data.catalog_xmin, recentXid); + s->data.catalog_xmin, xidLimit); /* done with this slot for now */ break; @@ -2284,7 +2276,7 @@ bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, TransactionId snapshotConflictHorizon, - TransactionId recentXid) + TransactionId xidLimit) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2323,7 +2315,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, 
snapshotConflictHorizon, - recentXid, &released_lock)) + xidLimit, &released_lock)) { Assert(released_lock); @@ -3391,7 +3383,7 @@ MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, 0, InvalidOid, InvalidTransactionId, - recentXid); + xidLimit); return invalidated; } diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 9e0acf7309d..898ef4a0833 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1938,10 +1938,11 @@ GlobalVisHorizonKindForRel(Relation rel) } /* - * Helper to return the appropriate oldest non-removable TransactionId from - * pre-computed horizons, based on the relation type. + * A helper function to return the appropriate oldest non-removable + * TransactionId from the pre-computed horizons, based on the relation + * type. */ -static inline TransactionId +static pg_attribute_always_inline TransactionId GetOldestNonRemovableTransactionIdFromHorizons(ComputeXidHorizonsResult *horizons, Relation rel) { @@ -1989,9 +1990,9 @@ GetOldestNonRemovableTransactionId(Relation rel) * needs both values. 
  */
 TransactionId
-GetOldestNonRemovableTransactionIdExt(Relation rel,
-									  TransactionId *slot_xmin,
-									  TransactionId *slot_catalog_xmin)
+GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel,
+											   TransactionId *slot_xmin,
+											   TransactionId *slot_catalog_xmin)
 {
 	ComputeXidHorizonsResult horizons;
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 0baa7112559..d143b19a7b3 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -371,7 +371,7 @@ extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes,
 											   XLogSegNo oldestSegno,
 											   Oid dboid,
 											   TransactionId snapshotConflictHorizon,
-											   TransactionId recentXid);
+											   TransactionId xidLimit);
 extern bool MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin,
 										TransactionId slot_catalog_xmin);
 extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index c198fd22515..a94091ce7fd 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -53,9 +53,9 @@ extern RunningTransactions GetRunningTransactionData(void);
 extern bool TransactionIdIsInProgress(TransactionId xid);
 extern TransactionId GetOldestNonRemovableTransactionId(Relation rel);
-extern TransactionId GetOldestNonRemovableTransactionIdExt(Relation rel,
-															TransactionId *slot_xmin,
-															TransactionId *slot_catalog_xmin);
+extern TransactionId GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel,
+																	 TransactionId *slot_xmin,
+																	 TransactionId *slot_catalog_xmin);
 extern TransactionId GetOldestTransactionIdConsideredRunning(void);
 extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs);
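The fixup patch above has each caller derive an XID limit once, via TransactionIdRetreatedBy(ReadNextTransactionId(), max_slot_xid_age), and has the errdetail rebuild the slot's age as max_slot_xid_age plus the distance between that limit and the slot's XID. A minimal standalone sketch of that arithmetic follows, assuming 32-bit circular XIDs. The names xid_retreated_by, xid_precedes, and reported_slot_age are hypothetical stand-ins written for illustration; they are not PostgreSQL's functions, and they only cover normal (non-reserved) XIDs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t Xid;

/* Assumption: XIDs 0..2 are reserved, so the first normal XID is 3. */
#define FIRST_NORMAL_XID 3u

/*
 * Step back `amount` XIDs on the 32-bit circle. If the result lands on a
 * reserved value, keep stepping back until it wraps to a normal XID.
 */
static inline Xid
xid_retreated_by(Xid xid, uint32_t amount)
{
	xid -= amount;
	while (xid < FIRST_NORMAL_XID)
		xid--;
	return xid;
}

/*
 * Circular "a is older than b" for normal XIDs: the signed 32-bit
 * difference is negative when a is behind b. The unsigned-to-signed cast
 * relies on two's-complement conversion, as on all common platforms.
 */
static inline int
xid_precedes(Xid a, Xid b)
{
	return (int32_t) (a - b) < 0;
}

/*
 * Age to report for an invalidated slot: the configured maximum age plus
 * the number of XIDs by which the slot's XID trails the limit.
 */
static inline int32_t
reported_slot_age(Xid slot_xid, Xid xid_limit, int32_t max_age)
{
	int32_t		exceeded_by = (int32_t) (xid_limit - slot_xid);

	return max_age + exceeded_by;
}
```

For example, with a next XID of 1000 and a maximum age of 100, the limit is 900; a slot xmin of 850 precedes that limit and would be reported with an age of 150, exceeding the setting by 50. The same arithmetic holds when the next XID has wrapped past zero, because both the comparison and the subtraction are done modulo 2^32.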
* Re: Introduce XID age based replication slot invalidation
@ 2026-04-01 21:21 Bharath Rupireddy <[email protected]>
  parent: Masahiko Sawada <[email protected]>
  0 siblings, 1 reply; 31+ messages in thread
From: Bharath Rupireddy @ 2026-04-01 21:21 UTC
To: Masahiko Sawada <[email protected]>;
+Cc: Srinath Reddy Sadipiralla <[email protected]>;
     SATYANARAYANA NARLAPURAM <[email protected]>;
     Hayato Kuroda (Fujitsu) <[email protected]>;
     John H <[email protected]>;
     pgsql-hackers

Hi,

On Wed, Apr 1, 2026 at 12:39 PM Masahiko Sawada <[email protected]> wrote:
>
> I've reviewed the v7 patch and have some review comments:

Thank you for reviewing the patch.

> +# Advance the given number of XIDs
> +sub advance_xids
> +{
> +	my ($node, $nxids) = @_;
> +	my $sql = join(";\n", ("SELECT pg_current_xact_id()") x $nxids);
> +	$node->safe_psql('postgres', $sql);
> +}
>
> I think we can create a procedure on primary5 instance to consume XIDs
> as follow:
>
> $standby5->safe_psql(
>     'postgres',
>     qq{CREATE PROCEDURE consume_xid(cnt int)
> AS \$\$
> DECLARE
>     i int;
> BEGIN
>     FOR i in 1..cnt LOOP
>         EXECUTE 'SELECT pg_current_xact_id()';
>         COMMIT;
>     END LOOP;
> END;
> +\$\$
> LANGUAGE plpgsql;
> });

Agreed. Although the test timings don't improve (9 seconds on my dev
machine) after moving to the procedure vs. sending pg_current_xact_id
SQL statements, the procedure approach looks better and is more
consistent.

> ---
> +# Create a subscriber node
> +my $subscriber5 = PostgreSQL::Test::Cluster->new('subscriber5');
> +$subscriber5->init(allows_streaming => 'logical');
> +$subscriber5->start;
>
> Do we really need to create a subscriber for this test? I think we can
> simply create a logical slot on the primary5 and test the XID-age
> based slot invalidation.

Nice catch! Removed.

> ---
> I've attached a fixup patch to propose some cleanup and refactoring, including:
>
> - changes to invalidation errdetail message.
> - passing xidLimit instead of recentXid to simplify the invalidation logic.
> - documentation changes.
> - comment changes.

I took the above changes into v8 and fixed a typo in using xidLimit
instead of slotXidLimit.

Please find the attached v8 patches for further review. Thank you!

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Attachments:

[application/x-patch] v8-0002-Add-more-tests-for-XID-age-slot-invalidation.patch (6.3K, 2-v8-0002-Add-more-tests-for-XID-age-slot-invalidation.patch)

From 4ec56140a2dab24bf303b374465127ce93404cc9 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Wed, 1 Apr 2026 20:56:38 +0000
Subject: [PATCH v8 2/2] Add more tests for XID age slot invalidation

Consume XIDs up to wraparound WARNING limits with max_slot_xid_age
matching vacuum_failsafe_age (1.6B). Verify that autovacuum
invalidates the inactive replication slot (XID-age-based
invalidation), unblocks datfrozenxid advancement, and prevents
wraparound without any intervention.
---
 src/test/recovery/Makefile                |   3 +-
 src/test/recovery/t/019_replslot_limit.pl | 130 ++++++++++++++++++++++
 2 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile
index d41aaaf8ae1..5c3d2c89941 100644
--- a/src/test/recovery/Makefile
+++ b/src/test/recovery/Makefile
@@ -12,7 +12,8 @@
 EXTRA_INSTALL=contrib/pg_prewarm \
 	contrib/pg_stat_statements \
 	contrib/test_decoding \
-	src/test/modules/injection_points
+	src/test/modules/injection_points \
+	src/test/modules/xid_wraparound
 subdir = src/test/recovery
 top_builddir = ../../..
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index c5f6e5bba87..c9b273e1d7c 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -698,4 +698,134 @@ $primary5->stop; # Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC # ============================================================================= +# ================================================================================= +# Testcase start: XID-age-based slot invalidation with autovacuum (production-like) + +# Standby sets slot xmin via HS feedback, disconnects, XIDs are consumed. +# max_slot_xid_age is set to vacuum_failsafe_age (1.6B) so autovacuum +# invalidates the slot before entering failsafe mode, unblocking +# datfrozenxid advancement and avoiding XID wraparound without manual +# VACUUM or downtime. + +# Verify server log shows slot invalidation by autovacuum worker +sub verify_slot_xid_aged_invalidation_in_server_log +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + my $log = slurp_file($node->logfile); + + # Verify the invalidation was performed by an autovacuum worker + like($log, + qr/autovacuum worker\[\d+\] LOG:\s+invalidating obsolete replication slot "$slot_name"/, + "server log: $slot_name invalidated by autovacuum worker"); + + # Verify DETAIL shows the xmin age exceeding max_slot_xid_age + like($log, + qr/autovacuum worker\[\d+\] DETAIL:\s+The slot's (?:catalog )?xmin age of (\d+) exceeds the configured "max_slot_xid_age" of $max_age by (\d+) transactions/, + "server log: DETAIL shows xmin age exceeds max_slot_xid_age $max_age"); + + # Extract xid age from the log and report for diagnostics + $log =~ + /The slot's (?:catalog )?xmin age of (\d+) exceeds the configured "max_slot_xid_age" of $max_age by (\d+)/; + my $log_xid_age = $1 // 'N/A'; + my $exceeded_by = $2 // 'N/A'; + diag "xid_age from server log=$log_xid_age, exceeded_by=$exceeded_by, 
max_slot_xid_age=$max_age, consumed=$consumed_xids XIDs"; +} + +# Verify slot invalidation and wait for autovacuum to advance datfrozenxid +sub verify_invalidation_and_recovery +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + return if $max_age == 0; + + wait_for_xid_aged_invalidation($node, $slot_name); + ok(1, 'autovacuum invalidated slot due to xid_aged'); + + verify_slot_xid_aged_invalidation_in_server_log($node, $slot_name, + $max_age, $consumed_xids); + + # Wait for autovacuum to advance datfrozenxid in all databases past the + # wraparound threshold. + $node->poll_query_until( + 'postgres', qq[ + SELECT NOT EXISTS ( + SELECT 1 FROM pg_database + WHERE age(datfrozenxid) > 2000000000 + ); + ]) or die "Timed out waiting for autovacuum to advance datfrozenxid in all databases"; +} + +my $primary6 = PostgreSQL::Test::Cluster->new('primary6'); +$primary6->init(allows_streaming => 'logical'); + +$max_slot_xid_age = 1600000000; # matches vacuum_failsafe_age default +$primary6->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = 1s +}); + +$primary6->start; +$primary6->safe_psql('postgres', "CREATE EXTENSION xid_wraparound"); + +$backup_name = 'backup6'; +$primary6->backup($backup_name); + +my $standby6 = PostgreSQL::Test::Cluster->new('standby6'); +$standby6->init_from_backup($primary6, $backup_name, has_streaming => 1); +$standby6->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb6_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary6->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb6_slot', true)"); + +$standby6->start; + +$primary6->safe_psql('postgres', + "CREATE TABLE tab_int6 AS SELECT generate_series(1,10) AS a"); +$primary6->wait_for_catchup($standby6); + +$primary6->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL FROM pg_replication_slots + WHERE slot_name = 'sb6_slot'; +]) or die "Timed out waiting for sb6_slot xmin 
from HS feedback"; + +# Stop standby; slot xmin persists and holds back datfrozenxid +$standby6->stop; + +# Consume XIDs in 50M chunks; autovacuum (naptime=1s) will invalidate the +# slot once xmin age exceeds max_slot_xid_age. +my $logstart6 = -s $primary6->logfile; +my $chunk = 50_000_000; +my $max_xids = 2_200_000_000; +my $consumed = 0; + +while ($consumed < $max_xids) +{ + $primary6->safe_psql('postgres', "SELECT consume_xids($chunk)"); + $consumed += $chunk; + my $remaining = $max_xids - $consumed; + diag "consumed $consumed / $max_xids XIDs ($remaining remaining)"; +} + +verify_invalidation_and_recovery($primary6, 'sb6_slot', + $max_slot_xid_age, $consumed); + +# Consume 1B more XIDs — combining with the 2.2B consumed above, the total +# of 3.2B exceeds the 2^31 (~2.1B) usable XID space (xidStopLimit), i.e. +# more than one full wraparound cycle, proving the system is healthy. +$primary6->safe_psql('postgres', "SELECT consume_xids(1000000000)"); +ok(1, 'writes succeed after autovacuum invalidated the slot'); + +$primary6->stop; + +# Testcase end: XID-age-based slot invalidation with autovacuum (production-like) +# ================================================================================ + done_testing(); -- 2.47.3 [application/x-patch] v8-0001-Add-XID-age-based-replication-slot-invalidation.patch (30.9K, 3-v8-0001-Add-XID-age-based-replication-slot-invalidation.patch) download | inline diff: From ef7d5550d0ab29b5581b42c3fdb77c1e73090a04 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Wed, 1 Apr 2026 20:55:51 +0000 Subject: [PATCH v8 1/2] Add XID age based replication slot invalidation Introduce max_slot_xid_age, a GUC that invalidates replication slots whose xmin or catalog_xmin exceeds the specified age. Disabled by default. Idle or forgotten replication slots can hold back vacuum, leading to bloat and eventually XID wraparound. In the worst case this requires dropping the slot and single-user mode vacuuming. 
This setting avoids that by proactively invalidating slots that have fallen too far behind. Invalidation checks are performed once per relation during vacuum (both vacuum command and autovacuum), and also by the checkpointer during checkpoints and restartpoints. --- doc/src/sgml/config.sgml | 47 ++++++ doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 15 ++ src/backend/access/transam/xlog.c | 34 +++- src/backend/commands/vacuum.c | 5 +- src/backend/replication/slot.c | 122 +++++++++++++- src/backend/storage/ipc/procarray.c | 61 +++++-- src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/include/commands/vacuum.h | 9 + src/include/replication/slot.h | 10 +- src/include/storage/procarray.h | 3 + src/test/recovery/t/019_replslot_limit.pl | 158 ++++++++++++++++++ 14 files changed, 459 insertions(+), 26 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 229f41353eb..9ad662b8b6f 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,53 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <structfield>xmin</structfield> age + or <structfield>catalog_xmin</structfield> age in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view has exceeded the age specified by this setting. Slot invalidation + due to this limit occurs during vacuum (both <command>VACUUM</command> + command and autovacuum) and during checkpoint. + A value of zero (the default) disables this feature. 
Users can set + this value anywhere from zero to two billion transactions. This parameter + can only be set in the <filename>postgresql.conf</filename> file or on + the server command line. + </para> + + <para> + The current age of a slot's <literal>xmin</literal> and + <literal>catalog_xmin</literal> can be monitored by applying the + <function>age</function> function to the corresponding columns in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view. + </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + + <para> + Note that this invalidation mechanism is not applicable for slots + on the standby server that are being synced from the primary server + (i.e., standby slots having + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> + value <literal>true</literal>). Synced slots are always considered to + be inactive because they don't perform logical decoding to produce + changes. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. 
+ </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 88c71cd85b6..4b716122107 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -147,6 +147,7 @@ #include "pgstat.h" #include "portability/instr_time.h" #include "postmaster/autovacuum.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/freespace.h" #include "storage/latch.h" @@ -799,6 +800,20 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params, * to increase the number of dead tuples it can prune away.) */ vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + + /* + * Try to invalidate XID-aged replication slots. Use the slot xmin values + * obtained from the same horizons computation that produced OldestXmin, + * avoiding an extra ProcArrayLock acquisition. + */ + if (MaybeInvalidateXIDAgedSlots(vacrel->cutoffs.slot_xmin, + vacrel->cutoffs.slot_catalog_xmin)) + { + /* Recompute cutoffs after slot invalidation */ + vacrel->aggressive = vacuum_get_cutoffs(rel, params, + &vacrel->cutoffs); + } + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2c1c6f88b74..eac73091172 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7019,6 +7019,8 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7447,9 +7449,20 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + + slotInvalidationCauses = 
RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if (max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -7730,6 +7743,8 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -7904,9 +7919,19 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if (max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -8770,6 +8795,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 0ed363d1c85..cda86b9d50f 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1133,7 +1133,10 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params, * that only one vacuum process can 
be working on a particular table at * any time, and that each vacuum is always an independent transaction. */ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + cutoffs->OldestXmin = + GetOldestNonRemovableTransactionIdWithSlotXids(rel, + &cutoffs->slot_xmin, + &cutoffs->slot_catalog_xmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..e6271e2a519 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin older + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. @@ -1780,7 +1787,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId xidLimit) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1835,28 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + TransactionId slot_xid = TransactionIdIsValid(xmin) ? 
xmin : catalog_xmin; + int32 exceeded_by = (int32) (xidLimit - slot_xid); + int32 slot_age = (int32) max_slot_xid_age + exceeded_by; + + Assert(TransactionIdIsValid(slot_xid)); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, + TransactionIdIsValid(xmin) + ? _("The slot's xmin age of %d exceeds the configured \"%s\" of %d by %d transactions") + : _("The slot's catalog xmin age of %d exceeds the configured \"%s\" of %d by %d transactions"), + slot_age, "max_slot_xid_age", max_slot_xid_age, exceeded_by); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1863,6 +1895,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. + */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. 
@@ -1874,6 +1925,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1997,18 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + Assert(TransactionIdIsValid(xidLimit)); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1967,6 +2031,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2019,6 +2084,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + xidLimit, &inactive_since, now); @@ -2112,7 +2178,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); if (MyBackendType == B_STARTUP) (void) SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2232,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); /* done with this slot for now */ break; 
@@ -2192,6 +2260,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2205,7 +2275,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, + TransactionId xidLimit) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2315,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + xidLimit, &released_lock)) { Assert(released_lock); @@ -3275,3 +3346,44 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Invalidate replication slots whose XID age exceeds the max_slot_xid_age + * GUC. + * + * The slot_xmin and slot_catalog_xmin are the replication slot xmin values + * obtained from the same ComputeXidHorizons() call that computed OldestXmin + * during vacuum. Using these avoids a separate ProcArrayLock acquisition. + * + * Returns true if at least one slot was invalidated. + */ +bool +MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin) +{ + TransactionId recentXid; + TransactionId xidLimit; + bool invalidated = false; + + if (max_slot_xid_age == 0) + return false; + + recentXid = ReadNextTransactionId(); + xidLimit = TransactionIdRetreatedBy(recentXid, max_slot_xid_age); + + /* + * Invalidate possibly obsolete slots based on XID-age, if either slot's + * xmin or catalog_xmin is older than the cutoff. 
+ */ + if ((TransactionIdIsValid(slot_xmin) && + TransactionIdPrecedes(slot_xmin, xidLimit)) || + (TransactionIdIsValid(slot_catalog_xmin) && + TransactionIdPrecedes(slot_catalog_xmin, xidLimit))) + invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, + InvalidOid, + InvalidTransactionId, + xidLimit); + + return invalidated; +} diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index cc207cb56e3..898ef4a0833 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1937,6 +1937,31 @@ GlobalVisHorizonKindForRel(Relation rel) return VISHORIZON_TEMP; } +/* + * A helper function to return the appropriate oldest non-removable + * TransactionId from the pre-computed horizons, based on the relation + * type. + */ +static pg_attribute_always_inline TransactionId +GetOldestNonRemovableTransactionIdFromHorizons(ComputeXidHorizonsResult *horizons, + Relation rel) +{ + switch (GlobalVisHorizonKindForRel(rel)) + { + case VISHORIZON_SHARED: + return horizons->shared_oldest_nonremovable; + case VISHORIZON_CATALOG: + return horizons->catalog_oldest_nonremovable; + case VISHORIZON_DATA: + return horizons->data_oldest_nonremovable; + case VISHORIZON_TEMP: + return horizons->temp_oldest_nonremovable; + } + + /* just to prevent compiler warnings */ + return InvalidTransactionId; +} + /* * Return the oldest XID for which deleted tuples must be preserved in the * passed table. 
@@ -1955,20 +1980,30 @@ GetOldestNonRemovableTransactionId(Relation rel) ComputeXidHorizons(&horizons); - switch (GlobalVisHorizonKindForRel(rel)) - { - case VISHORIZON_SHARED: - return horizons.shared_oldest_nonremovable; - case VISHORIZON_CATALOG: - return horizons.catalog_oldest_nonremovable; - case VISHORIZON_DATA: - return horizons.data_oldest_nonremovable; - case VISHORIZON_TEMP: - return horizons.temp_oldest_nonremovable; - } + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); +} - /* just to prevent compiler warnings */ - return InvalidTransactionId; +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. + */ +TransactionId +GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) +{ + ComputeXidHorizonsResult horizons; + + ComputeXidHorizons(&horizons); + + if (slot_xmin) + *slot_xmin = horizons.slot_xmin; + if (slot_catalog_xmin) + *slot_catalog_xmin = horizons.slot_catalog_xmin; + + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); } /* diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index de9092fdf5b..d60f39ec08e 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 0a862693fcd..ca3cc8417da 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ 
b/src/backend/utils/misc/guc_parameters.dat @@ -2089,6 +2089,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2000000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index cf15597385b..055eba56bdf 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 5b8023616c0..931500f0665 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -287,6 +287,15 @@ struct VacuumCutoffs */ TransactionId FreezeLimit; MultiXactId MultiXactCutoff; + + /* + * Replication slot xmin and catalog_xmin values obtained from the same + * ComputeXidHorizons() call that computed OldestXmin. These are used for + * XID-age-based replication slot invalidation without requiring an extra + * ProcArrayLock acquisition. 
+ */ + TransactionId slot_xmin; + TransactionId slot_catalog_xmin; }; /* diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..d143b19a7b3 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,10 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId xidLimit); +extern bool MaybeInvalidateXIDAgedSlots(TransactionId slot_xmin, + TransactionId slot_catalog_xmin); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index abdf021e66e..a94091ce7fd 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -53,6 +53,9 @@ extern RunningTransactions GetRunningTransactionData(void); extern bool TransactionIdIsInProgress(TransactionId xid); 
extern TransactionId GetOldestNonRemovableTransactionId(Relation rel); +extern TransactionId GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern TransactionId GetOldestTransactionIdConsideredRunning(void); extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs); diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..c5f6e5bba87 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,162 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Wait for the given slot to be invalidated with reason 'xid_aged' +sub wait_for_xid_aged_invalidation +{ + my ($node, $slot_name) = @_; + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) or die "Timed out waiting for slot $slot_name to be invalidated"; +} + +# ===================================================================== +# Testcase start: Invalidate physical slot due to max_slot_xid_age GUC + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Disable autovacuum so checkpointer triggers the invalidation +my $max_slot_xid_age = 100; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); + +$primary5->start; + +# Create a procedure to consume XIDs +$primary5->safe_psql( + 'postgres', qq{ + CREATE PROCEDURE consume_xid(cnt int) + AS \$\$ + DECLARE + i int; + BEGIN + FOR i IN 1..cnt LOOP + EXECUTE 'SELECT pg_current_xact_id()'; + COMMIT; + END LOOP; + END; + \$\$ LANGUAGE plpgsql; +}); + +# Take a backup for creating standby +$backup_name = 'backup5'; 
+$primary5->backup($backup_name); + +# Create standby with HS feedback so the slot gains an xmin +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); +$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); +$primary5->wait_for_catchup($standby5); + +# Wait for the physical slot to get xmin via hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby so the slot becomes inactive with its xmin frozen +$standby5->stop; + +# Advance XIDs past 2x max_slot_xid_age so the slot's xmin is stale enough +$primary5->safe_psql('postgres', "CALL consume_xid(" . (2 * $max_slot_xid_age) . ")"); +$primary5->safe_psql('postgres', "CHECKPOINT"); +wait_for_xid_aged_invalidation($primary5, 'sb5_slot'); +ok(1, "physical slot invalidated due to XID age (via checkpoint)"); + +# Testcase end: Invalidate physical slot due to max_slot_xid_age GUC +# =================================================================== + +# ==================================================================== +# Testcase start: Invalidate logical slot due to max_slot_xid_age GUC + +# Create a logical slot directly on the primary (no subscriber needed). +# The slot gets a catalog_xmin immediately upon creation. 
+$primary5->safe_psql('postgres', + "SELECT pg_create_logical_replication_slot('lsub5_slot', 'pgoutput')"); + +$primary5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin"; + +# Advance XIDs past 2x max_slot_xid_age so the slot's catalog_xmin is stale enough +$primary5->safe_psql('postgres', "CALL consume_xid(" . (2 * $max_slot_xid_age) . ")"); +$primary5->safe_psql('postgres', "VACUUM tab_int5"); +wait_for_xid_aged_invalidation($primary5, 'lsub5_slot'); +ok(1, "logical slot invalidated due to XID age (via vacuum)"); + +# Testcase end: Invalidate logical slot due to max_slot_xid_age GUC +# ================================================================== + +# =============================================================================== +# Testcase start: Invalidate logical slot on standby due to max_slot_xid_age GUC + +# Disable max_slot_xid_age on primary and recreate the streaming slot +$primary5->safe_psql('postgres', + "ALTER SYSTEM SET max_slot_xid_age = 0; SELECT pg_reload_conf();"); +$primary5->safe_psql('postgres', + "SELECT pg_drop_replication_slot('sb5_slot')"); +$primary5->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb5_slot', true)"); +$standby5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); +$standby5->start; + +$primary5->wait_for_catchup($standby5); + +$standby5->create_logical_slot_on_standby($primary5, 'sb5_logical_slot', + 'postgres'); + +$standby5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_logical_slot'; +]) or die "Timed out waiting for sb5_logical_slot catalog_xmin"; + +# Advance XIDs on primary, replay on standby, then restartpoint to invalidate +$primary5->safe_psql('postgres', "CALL consume_xid(" . 
(2 * $max_slot_xid_age) . ")");
+$primary5->safe_psql('postgres', "CHECKPOINT");
+$primary5->wait_for_catchup($standby5);
+$standby5->safe_psql('postgres', "CHECKPOINT");
+
+wait_for_xid_aged_invalidation($standby5, 'sb5_logical_slot');
+ok(1, "logical (standby) slot invalidated due to XID age (via restartpoint)");
+
+$standby5->stop;
+$primary5->stop;
+
+# Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC
+# =============================================================================
+
 done_testing();
-- 
2.47.3
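For readers following the age check in the patch above, here is a minimal standalone sketch in C of the circular XID comparison it relies on. It is not code from the patch: the names xid_t, xid_precedes, xid_retreated_by and slot_exceeds_xid_age are hypothetical stand-ins, assumed to mirror how TransactionIdPrecedes() and TransactionIdRetreatedBy() behave for normal XIDs. The slot is treated as too old when its xmin or catalog_xmin precedes the limit obtained by stepping the next XID back by max_slot_xid_age.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's 32-bit TransactionId. */
typedef uint32_t xid_t;

#define INVALID_XID      ((xid_t) 0)	/* "no XID", as InvalidTransactionId */
#define FIRST_NORMAL_XID ((xid_t) 3)	/* XIDs 0..2 are reserved */

/*
 * True if a logically precedes b in circular XID space.
 * Assumption: both values are normal XIDs less than 2^31 apart.
 */
static bool
xid_precedes(xid_t a, xid_t b)
{
	return (int32_t) (a - b) < 0;
}

/*
 * Step xid back by amount. If the result lands on a reserved XID (0..2),
 * keep stepping back so that it wraps around to a normal XID.
 */
static xid_t
xid_retreated_by(xid_t xid, uint32_t amount)
{
	xid -= amount;
	while (xid < FIRST_NORMAL_XID)
		xid--;
	return xid;
}

/*
 * Sketch of the age test: would a slot with the given xmin and
 * catalog_xmin exceed max_age, given the next XID to be assigned?
 * A max_age of 0 disables the check, as with the GUC in the patch.
 */
static bool
slot_exceeds_xid_age(xid_t next_xid, xid_t slot_xmin,
					 xid_t slot_catalog_xmin, uint32_t max_age)
{
	xid_t		limit;

	if (max_age == 0)
		return false;

	/* The signed comparison is only meaningful for ages below 2^31. */
	assert(max_age <= (uint32_t) INT32_MAX);

	limit = xid_retreated_by(next_xid, max_age);

	return (slot_xmin != INVALID_XID &&
			xid_precedes(slot_xmin, limit)) ||
		(slot_catalog_xmin != INVALID_XID &&
		 xid_precedes(slot_catalog_xmin, limit));
}
```

With next_xid = 1000 and max_age = 100 the limit is 900, so a slot xmin of 800 is past the limit while 950 is not. The same test keeps working across wraparound, for example when next_xid is 50 and the slot's xmin is a large value just below 2^32.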
* Re: Introduce XID age based replication slot invalidation
@ 2026-04-03 19:04 Bharath Rupireddy <[email protected]>
  parent: Bharath Rupireddy <[email protected]>
  0 siblings, 1 reply; 31+ messages in thread
From: Bharath Rupireddy @ 2026-04-03 19:04 UTC
To: Masahiko Sawada <[email protected]>;
+Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

Hi,

On Wed, Apr 1, 2026 at 2:21 PM Bharath Rupireddy <[email protected]> wrote:
>
> On Wed, Apr 1, 2026 at 12:39 PM Masahiko Sawada <[email protected]> wrote:
> >
> > I've reviewed the v7 patch and have some review comments:
>
> Thank you for reviewing the patch.
>
> I took the above changes into v8 and fixed a typo in using xidLimit
> instead of slotXidLimit.
>
> Please find the attached v8 patches for further review. Thank you!

Thank you, Sawada-san, for reviewing and providing some offlist comments.

1/ Included a note in the docs to say that logical replication slots
are also affected by XID age GUC (similar to
idle_replication_slot_timeout).

2/ Added the code to disable the XID age invalidation in
pg_createsubscriber similar to timeout invalidation. Commit 72e6c08fea
ensured that none of the logical replication slots get invalidated
during the upgrade. (I believe the work that pg_upgrade and
pg_createsubscriber do is more important, and the slots created and
used by them or slots in use during those processes must not interfere
with the upgrade or creating a logical replica from a standby.)

3/ Changed the max value of XID age GUC to be equal to that of vacuum
failsafe age. In my opinion, the best use of max_slot_xid_age would be
to set it equal to or a little less than vacuum_failsafe_age. Also
added a note in the docs about this.

4/ Changed variable names for consistency.

5/ Added code to MaybeInvalidateXIDAgedSlots() to skip the slot
invalidation attempt (unnecessary work) when slots are not the reason
for holding back the OldestXmin. Added an equality check to see if
OldestXmin is either OldestSlotXmin or OldestSlotCatalogXmin (all these
OldestXXXXmins are computed from the same ComputeXidHorizons() call).
This should allow us to skip the slot invalidation attempt when a
backend is holding the xmin back (a long-running transaction, for
example).

Please find the attached v9 patches for further review. Thank you!

-- 
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Attachments:
[application/octet-stream] v9-0001-Add-XID-age-based-replication-slot-invalidation.patch (33.6K, 2-v9-0001-Add-XID-age-based-replication-slot-invalidation.patch)

From c68a15527999b1f52ed648873b03878a6f1aed32 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Fri, 3 Apr 2026 18:27:15 +0000
Subject: [PATCH v9 1/2] Add XID age based replication slot invalidation

Introduce max_slot_xid_age, a GUC that invalidates replication slots
whose xmin or catalog_xmin exceeds the specified age. Disabled by
default.

Idle or forgotten replication slots can hold back vacuum, leading to
bloat and eventually XID wraparound. In the worst case this requires
dropping the slot and single-user mode vacuuming. This setting avoids
that by proactively invalidating slots that have fallen too far behind.

Invalidation checks are performed once per relation during vacuum (both
vacuum command and autovacuum), and also by the checkpointer during
checkpoints and restartpoints.
--- doc/src/sgml/config.sgml | 54 ++++++ doc/src/sgml/logical-replication.sgml | 4 +- doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 15 ++ src/backend/access/transam/xlog.c | 34 +++- src/backend/commands/vacuum.c | 5 +- src/backend/replication/slot.c | 136 +++++++++++++- src/backend/storage/ipc/procarray.c | 61 ++++-- src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/bin/pg_basebackup/pg_createsubscriber.c | 2 +- src/include/commands/vacuum.h | 7 + src/include/replication/slot.h | 11 +- src/include/storage/procarray.h | 3 + src/test/recovery/t/019_replslot_limit.pl | 175 ++++++++++++++++++ 16 files changed, 500 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index fdb77df0fdb..d51a1060bee 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,60 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <structfield>xmin</structfield> age + or <structfield>catalog_xmin</structfield> age in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view has exceeded the age specified by this setting. Slot invalidation + due to this limit occurs during vacuum (both <command>VACUUM</command> + command and autovacuum) and during checkpoint. + A value of zero (the default) disables this feature. Users can set + this value anywhere from zero to two billion transactions. This parameter + can only be set in the <filename>postgresql.conf</filename> file or on + the server command line. 
+ </para> + + <para> + The current age of a slot's <literal>xmin</literal> and + <literal>catalog_xmin</literal> can be monitored by applying the + <function>age</function> function to the corresponding columns in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view. + </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + + <para> + It is recommended to set <varname>max_slot_xid_age</varname> + to a value equal to or slightly less than + <xref linkend="guc-vacuum-failsafe-age"/>, so that the slot holding the + vacuum back is invalidated before vacuum enters failsafe mode. + </para> + + <para> + Note that this invalidation mechanism is not applicable for slots + on the standby server that are being synced from the primary server + (i.e., standby slots having + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> + value <literal>true</literal>). Synced slots are always considered to + be inactive because they don't perform logical decoding to produce + changes. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml index 23b268273b9..a865a5e6c28 100644 --- a/doc/src/sgml/logical-replication.sgml +++ b/doc/src/sgml/logical-replication.sgml @@ -2649,7 +2649,9 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER <para> Logical replication slots are also affected by - <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>. 
+    <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>
+    and
+    <link linkend="guc-max-slot-xid-age"><varname>max_slot_xid_age</varname></link>.
    </para>
 
    <para>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..1a507b430f9 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
           <xref linkend="guc-idle-replication-slot-timeout"/> duration.
          </para>
         </listitem>
+        <listitem>
+         <para>
+          <literal>xid_aged</literal> means that the slot's
+          <literal>xmin</literal> or <literal>catalog_xmin</literal>
+          has reached the age specified by
+          <xref linkend="guc-max-slot-xid-age"/> parameter.
+         </para>
+        </listitem>
        </itemizedlist>
       </para></entry>
      </row>
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 88c71cd85b6..bf048167679 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -147,6 +147,7 @@
 #include "pgstat.h"
 #include "portability/instr_time.h"
 #include "postmaster/autovacuum.h"
+#include "replication/slot.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/latch.h"
@@ -799,6 +800,20 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params,
 	 * to increase the number of dead tuples it can prune away.)
 	 */
 	vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs);
+
+	/* Try to invalidate XID-aged replication slots */
+	if (MaybeInvalidateXIDAgedSlots(vacrel->cutoffs.OldestXmin,
+									vacrel->cutoffs.OldestSlotXmin,
+									vacrel->cutoffs.OldestSlotCatalogXmin))
+	{
+		/*
+		 * Some slots have been invalidated; re-compute the vacuum cutoffs and
+		 * aggressiveness.
+ */ + vacrel->aggressive = vacuum_get_cutoffs(rel, params, + &vacrel->cutoffs); + } + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2c1c6f88b74..eac73091172 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7019,6 +7019,8 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7447,9 +7449,20 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if (max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -7730,6 +7743,8 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -7904,9 +7919,19 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if 
(max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -8770,6 +8795,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 0ed363d1c85..2243a7dcb80 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1133,7 +1133,10 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params, * that only one vacuum process can be working on a particular table at * any time, and that each vacuum is always an independent transaction. */ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + cutoffs->OldestXmin = + GetOldestNonRemovableTransactionIdWithSlotXids(rel, + &cutoffs->OldestSlotXmin, + &cutoffs->OldestSlotCatalogXmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..712f535090b 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin older + * than the specified age; '0' disables it. 
+ */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. @@ -1780,7 +1787,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId xidLimit) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1835,29 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + TransactionId slot_xid = TransactionIdIsValid(xmin) ? xmin : catalog_xmin; + int32 exceeded_by = (int32) (xidLimit - slot_xid); + int32 slot_age = (int32) max_slot_xid_age + exceeded_by; + + /* Either the slot's xmin or catalog_xmin must be valid */ + Assert(TransactionIdIsValid(slot_xid)); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, + TransactionIdIsValid(xmin) + ? _("The slot's xmin age of %d exceeds the configured \"%s\" of %d by %d transactions") + : _("The slot's catalog xmin age of %d exceeds the configured \"%s\" of %d by %d transactions"), + slot_age, "max_slot_xid_age", max_slot_xid_age, exceeded_by); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1863,6 +1896,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. 
+ */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. @@ -1874,6 +1926,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1998,18 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + Assert(TransactionIdIsValid(xidLimit)); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1967,6 +2032,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2019,6 +2085,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + xidLimit, &inactive_since, now); @@ -2112,7 +2179,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); if (MyBackendType == B_STARTUP) (void) 
SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2233,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); /* done with this slot for now */ break; @@ -2192,6 +2261,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2205,7 +2276,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, + TransactionId xidLimit) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2316,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + xidLimit, &released_lock)) { Assert(released_lock); @@ -3275,3 +3347,57 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) ConditionVariableCancelSleep(); } + +/* + * Invalidate replication slots whose XID age exceeds max_slot_xid_age. + * + * The caller supplies oldest xmin along with the oldest slot xmin and oldest + * catalog xmin, all obtained from the same ComputeXidHorizons() call. We use them + * to avoid unnecessary work: if a replication slot is not what's holding back + * OldestXmin (e.g., a long-running transaction is), invalidating slots would + * not help. 
Even when a slot is holding back OldestXmin, we still skip the + * invalidation call if OldestXmin has not yet exceeded max_slot_xid_age. + * + * Returns true if at least one slot was invalidated. + */ +bool +MaybeInvalidateXIDAgedSlots(TransactionId OldestXmin, + TransactionId OldestSlotXmin, + TransactionId OldestSlotCatalogXmin) +{ + TransactionId xidLimit; + bool slotHoldsOldestXminBack; + + if (max_slot_xid_age == 0) + return false; + + Assert(TransactionIdIsNormal(OldestXmin)); + + /* + * Check if a replication slot's xmin or catalog_xmin is what's holding + * back OldestXmin. If not, skip the unnecessary work. + */ + slotHoldsOldestXminBack = + (TransactionIdIsValid(OldestSlotXmin) && + TransactionIdEquals(OldestXmin, OldestSlotXmin)) || + (TransactionIdIsValid(OldestSlotCatalogXmin) && + TransactionIdEquals(OldestXmin, OldestSlotCatalogXmin)); + + if (!slotHoldsOldestXminBack) + return false; + + xidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + + /* + * A replication slot is holding back OldestXmin. Now check whether the + * slot holding it back has also exceeded the XID age. + */ + if (TransactionIdPrecedes(OldestXmin, xidLimit)) + return InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, InvalidOid, + InvalidTransactionId, + xidLimit); + + return false; +} diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index cc207cb56e3..898ef4a0833 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1937,6 +1937,31 @@ GlobalVisHorizonKindForRel(Relation rel) return VISHORIZON_TEMP; } +/* + * A helper function to return the appropriate oldest non-removable + * TransactionId from the pre-computed horizons, based on the relation + * type. 
+ */ +static pg_attribute_always_inline TransactionId +GetOldestNonRemovableTransactionIdFromHorizons(ComputeXidHorizonsResult *horizons, + Relation rel) +{ + switch (GlobalVisHorizonKindForRel(rel)) + { + case VISHORIZON_SHARED: + return horizons->shared_oldest_nonremovable; + case VISHORIZON_CATALOG: + return horizons->catalog_oldest_nonremovable; + case VISHORIZON_DATA: + return horizons->data_oldest_nonremovable; + case VISHORIZON_TEMP: + return horizons->temp_oldest_nonremovable; + } + + /* just to prevent compiler warnings */ + return InvalidTransactionId; +} + /* * Return the oldest XID for which deleted tuples must be preserved in the * passed table. @@ -1955,20 +1980,30 @@ GetOldestNonRemovableTransactionId(Relation rel) ComputeXidHorizons(&horizons); - switch (GlobalVisHorizonKindForRel(rel)) - { - case VISHORIZON_SHARED: - return horizons.shared_oldest_nonremovable; - case VISHORIZON_CATALOG: - return horizons.catalog_oldest_nonremovable; - case VISHORIZON_DATA: - return horizons.data_oldest_nonremovable; - case VISHORIZON_TEMP: - return horizons.temp_oldest_nonremovable; - } + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); +} - /* just to prevent compiler warnings */ - return InvalidTransactionId; +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. 
+ */ +TransactionId +GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) +{ + ComputeXidHorizonsResult horizons; + + ComputeXidHorizons(&horizons); + + if (slot_xmin) + *slot_xmin = horizons.slot_xmin; + if (slot_catalog_xmin) + *slot_catalog_xmin = horizons.slot_catalog_xmin; + + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); } /* diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index de9092fdf5b..d60f39ec08e 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index e556b8844d8..844ff8b5dea 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2089,6 +2089,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2100000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 2c5e98d1d4d..0038026a570 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c index 37631f700af..9ebc71fcac2 100644 --- a/src/bin/pg_basebackup/pg_createsubscriber.c +++ b/src/bin/pg_basebackup/pg_createsubscriber.c @@ -1901,7 +1901,7 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_ appendPQExpBufferStr(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\""); /* Prevent unintended slot invalidation */ - appendPQExpBufferStr(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\""); + appendPQExpBufferStr(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0 -c max_slot_xid_age=0\""); if (restricted_access) { diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 5b8023616c0..1a03336d6c0 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -287,6 +287,13 @@ struct VacuumCutoffs */ TransactionId FreezeLimit; MultiXactId MultiXactCutoff; + + /* + * Oldest xmin and catalog xmin of any replication slot obtained from the + * same ComputeXidHorizons() call that computed OldestXmin. 
+ */ + TransactionId OldestSlotXmin; + TransactionId OldestSlotCatalogXmin; }; /* diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..91f483024ce 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,11 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId xidLimit); +extern bool MaybeInvalidateXIDAgedSlots(TransactionId OldestXmin, + TransactionId OldestSlotXmin, + TransactionId OldestSlotCatalogXmin); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index abdf021e66e..a94091ce7fd 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -53,6 +53,9 @@ extern RunningTransactions GetRunningTransactionData(void); extern bool 
TransactionIdIsInProgress(TransactionId xid); extern TransactionId GetOldestNonRemovableTransactionId(Relation rel); +extern TransactionId GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern TransactionId GetOldestTransactionIdConsideredRunning(void); extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs); diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..8f0540a3c8b 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,179 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Wait for the given slot to be invalidated with reason 'xid_aged' +sub wait_for_xid_aged_invalidation +{ + my ($node, $slot_name) = @_; + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) or die "Timed out waiting for slot $slot_name to be invalidated"; +} + +# ===================================================================== +# Testcase start: Invalidate physical slot due to max_slot_xid_age GUC + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Disable autovacuum so checkpointer triggers the invalidation +my $max_slot_xid_age = 100; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); + +$primary5->start; + +# Create a procedure to consume XIDs +$primary5->safe_psql( + 'postgres', qq{ + CREATE PROCEDURE consume_xid(cnt int) + AS \$\$ + DECLARE + i int; + BEGIN + FOR i IN 1..cnt LOOP + EXECUTE 'SELECT pg_current_xact_id()'; + COMMIT; + END LOOP; + END; + \$\$ LANGUAGE plpgsql; +}); + +# Take a backup for creating standby 
+$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create standby with HS feedback so the slot gains an xmin +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); +$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); +$primary5->wait_for_catchup($standby5); + +# Wait for the physical slot to get xmin via hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby so the slot becomes inactive with its xmin frozen +$standby5->stop; + +# Advance XIDs past 2x max_slot_xid_age so the slot's xmin is stale enough +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); +$primary5->safe_psql('postgres', "CHECKPOINT"); +wait_for_xid_aged_invalidation($primary5, 'sb5_slot'); +ok(1, "physical slot invalidated due to XID age (via checkpoint)"); + +# Testcase end: Invalidate physical slot due to max_slot_xid_age GUC +# =================================================================== + +# ==================================================================== +# Testcase start: Invalidate logical slot due to max_slot_xid_age GUC + +# Create a logical slot directly on the primary (no subscriber needed). +# The slot gets a catalog_xmin immediately upon creation. 
+$primary5->safe_psql('postgres', + "SELECT pg_create_logical_replication_slot('lsub5_slot', 'pgoutput')"); + +$primary5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin"; + +# Advance XIDs past 2x max_slot_xid_age so the slot's catalog_xmin is stale enough +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); + +# Vacuum a user table so OldestXmin does not include the slot's catalog_xmin, +# skipping the invalidation of the slot. +$primary5->safe_psql('postgres', "VACUUM tab_int5"); +is( $primary5->safe_psql( + 'postgres', + qq[SELECT invalidation_reason IS NULL FROM pg_replication_slots WHERE slot_name = 'lsub5_slot';] + ), + 't', + 'logical slot not invalidated after vacuuming a data table'); + +# Vacuum a catalog table so OldestXmin includes the slot's catalog_xmin, +# triggering invalidation of the slot. 
+$primary5->safe_psql('postgres', "VACUUM pg_class"); +wait_for_xid_aged_invalidation($primary5, 'lsub5_slot'); +ok(1, "logical slot invalidated due to XID age (via vacuum)"); + +# Testcase end: Invalidate logical slot due to max_slot_xid_age GUC +# ================================================================== + +# =============================================================================== +# Testcase start: Invalidate logical slot on standby due to max_slot_xid_age GUC + +# Disable max_slot_xid_age on primary and recreate the streaming slot +$primary5->safe_psql( + 'postgres', + q{ +ALTER SYSTEM SET max_slot_xid_age = 0; +SELECT pg_reload_conf(); +}); +$primary5->safe_psql('postgres', + "SELECT pg_drop_replication_slot('sb5_slot')"); +$primary5->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb5_slot', true)"); +$standby5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); +$standby5->start; + +$primary5->wait_for_catchup($standby5); + +$standby5->create_logical_slot_on_standby($primary5, 'sb5_logical_slot', + 'postgres'); + +$standby5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_logical_slot'; +]) or die "Timed out waiting for sb5_logical_slot catalog_xmin"; + +# Advance XIDs on primary, replay on standby, then restartpoint to invalidate +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); +$primary5->safe_psql('postgres', "CHECKPOINT"); +$primary5->wait_for_replay_catchup($standby5); +$standby5->safe_psql('postgres', "CHECKPOINT"); + +wait_for_xid_aged_invalidation($standby5, 'sb5_logical_slot'); +ok(1, "logical (standby) slot invalidated due to XID age (via restartpoint)"); + +$standby5->stop; +$primary5->stop; + +# Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC +# 
============================================================================= + done_testing(); -- 2.47.3 [application/octet-stream] v9-0002-Add-more-tests-for-XID-age-slot-invalidation.patch (6.3K, 3-v9-0002-Add-more-tests-for-XID-age-slot-invalidation.patch) download | inline diff: From 3943d750a8883a2d20ae39ca8caa87fe5bf07971 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Fri, 3 Apr 2026 18:55:45 +0000 Subject: [PATCH v9 2/2] Add more tests for XID age slot invalidation Consume XIDs up to wraparound WARNING limits with max_slot_xid_age matching vacuum_failsafe_age (1.6B). Verify that autovacuum invalidates the inactive replication slot (XID-age-based invalidation), unblocks datfrozenxid advancement, and prevents wraparound without any intervention. --- src/test/recovery/Makefile | 3 +- src/test/recovery/t/019_replslot_limit.pl | 130 ++++++++++++++++++++++ 2 files changed, 132 insertions(+), 1 deletion(-) diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile index d41aaaf8ae1..5c3d2c89941 100644 --- a/src/test/recovery/Makefile +++ b/src/test/recovery/Makefile @@ -12,7 +12,8 @@ EXTRA_INSTALL=contrib/pg_prewarm \ contrib/pg_stat_statements \ contrib/test_decoding \ - src/test/modules/injection_points + src/test/modules/injection_points \ + src/test/modules/xid_wraparound subdir = src/test/recovery top_builddir = ../../.. 
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 8f0540a3c8b..8ac5f1f699d 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -715,4 +715,134 @@ $primary5->stop; # Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC # ============================================================================= +# ================================================================================= +# Testcase start: XID-age-based slot invalidation with autovacuum (production-like) + +# Standby sets slot xmin via HS feedback, disconnects, XIDs are consumed. +# max_slot_xid_age is set to vacuum_failsafe_age (1.6B) so autovacuum +# invalidates the slot before entering failsafe mode, unblocking +# datfrozenxid advancement and avoiding XID wraparound without manual +# VACUUM or downtime. + +# Verify server log shows slot invalidation by autovacuum worker +sub verify_slot_xid_aged_invalidation_in_server_log +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + my $log = slurp_file($node->logfile); + + # Verify the invalidation was performed by an autovacuum worker + like($log, + qr/autovacuum worker\[\d+\] LOG:\s+invalidating obsolete replication slot "$slot_name"/, + "server log: $slot_name invalidated by autovacuum worker"); + + # Verify DETAIL shows the xmin age exceeding max_slot_xid_age + like($log, + qr/autovacuum worker\[\d+\] DETAIL:\s+The slot's (?:catalog )?xmin age of (\d+) exceeds the configured "max_slot_xid_age" of $max_age by (\d+) transactions/, + "server log: DETAIL shows xmin age exceeds max_slot_xid_age $max_age"); + + # Extract xid age from the log and report for diagnostics + $log =~ + /The slot's (?:catalog )?xmin age of (\d+) exceeds the configured "max_slot_xid_age" of $max_age by (\d+)/; + my $log_xid_age = $1 // 'N/A'; + my $exceeded_by = $2 // 'N/A'; + diag "xid_age from server log=$log_xid_age, exceeded_by=$exceeded_by, 
max_slot_xid_age=$max_age, consumed=$consumed_xids XIDs"; +} + +# Verify slot invalidation and wait for autovacuum to advance datfrozenxid +sub verify_invalidation_and_recovery +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + return if $max_age == 0; + + wait_for_xid_aged_invalidation($node, $slot_name); + ok(1, 'autovacuum invalidated slot due to xid_aged'); + + verify_slot_xid_aged_invalidation_in_server_log($node, $slot_name, + $max_age, $consumed_xids); + + # Wait for autovacuum to advance datfrozenxid in all databases past the + # wraparound threshold. + $node->poll_query_until( + 'postgres', qq[ + SELECT NOT EXISTS ( + SELECT 1 FROM pg_database + WHERE age(datfrozenxid) > 2000000000 + ); + ]) or die "Timed out waiting for autovacuum to advance datfrozenxid in all databases"; +} + +my $primary6 = PostgreSQL::Test::Cluster->new('primary6'); +$primary6->init(allows_streaming => 'logical'); + +$max_slot_xid_age = 1600000000; # matches vacuum_failsafe_age default +$primary6->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = 1s +}); + +$primary6->start; +$primary6->safe_psql('postgres', "CREATE EXTENSION xid_wraparound"); + +$backup_name = 'backup6'; +$primary6->backup($backup_name); + +my $standby6 = PostgreSQL::Test::Cluster->new('standby6'); +$standby6->init_from_backup($primary6, $backup_name, has_streaming => 1); +$standby6->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb6_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary6->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb6_slot', true)"); + +$standby6->start; + +$primary6->safe_psql('postgres', + "CREATE TABLE tab_int6 AS SELECT generate_series(1,10) AS a"); +$primary6->wait_for_catchup($standby6); + +$primary6->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL FROM pg_replication_slots + WHERE slot_name = 'sb6_slot'; +]) or die "Timed out waiting for sb6_slot xmin 
from HS feedback"; + +# Stop standby; slot xmin persists and holds back datfrozenxid +$standby6->stop; + +# Consume XIDs in 50M chunks; autovacuum (naptime=1s) will invalidate the +# slot once xmin age exceeds max_slot_xid_age. +my $logstart6 = -s $primary6->logfile; +my $chunk = 50_000_000; +my $max_xids = 2_200_000_000; +my $consumed = 0; + +while ($consumed < $max_xids) +{ + $primary6->safe_psql('postgres', "SELECT consume_xids($chunk)"); + $consumed += $chunk; + my $remaining = $max_xids - $consumed; + diag "consumed $consumed / $max_xids XIDs ($remaining remaining)"; +} + +verify_invalidation_and_recovery($primary6, 'sb6_slot', + $max_slot_xid_age, $consumed); + +# Consume 1B more XIDs. Combined with the 2.2B consumed above, the total +# of 3.2B exceeds the 2^31 (~2.1B) usable XID space (xidStopLimit), i.e. +# more than one full wraparound cycle, proving the system is healthy. +$primary6->safe_psql('postgres', "SELECT consume_xids(1000000000)"); +ok(1, 'writes succeed after autovacuum invalidated the slot'); + +$primary6->stop; + +# Testcase end: XID-age-based slot invalidation with autovacuum (production-like) +# ================================================================================ + done_testing(); -- 2.47.3
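Editorial note on the v9 patch above: the decision made in MaybeInvalidateXIDAgedSlots() may be easier to follow as a standalone sketch. The block below is not PostgreSQL code. The names xid_precedes, xid_retreated_by and should_attempt_invalidation are hypothetical stand-ins, and the arithmetic only assumes the usual modulo-2^32 treatment of normal XIDs (values below 3 are reserved). It shows the two gates described in the patch comments: a slot must be what holds back OldestXmin, and OldestXmin must be older than next XID minus max_slot_xid_age.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's 32-bit transaction ID type. */
typedef uint32_t TransactionId;

#define INVALID_XID      ((TransactionId) 0)
#define FIRST_NORMAL_XID ((TransactionId) 3)

/*
 * Wraparound-aware "a is older than b". Assumes both are normal XIDs
 * that lie within 2^31 of each other.
 */
bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Step xid back by amount, skipping the reserved XIDs 0..2 if the
 * subtraction lands on one of them.
 */
TransactionId
xid_retreated_by(TransactionId xid, uint32_t amount)
{
    xid -= amount;
    while (xid < FIRST_NORMAL_XID)
        xid--;
    return xid;
}

/*
 * Sketch of the two gates: (1) a slot's xmin or catalog_xmin equals the
 * oldest xmin, i.e. a slot is the thing holding the horizon back, and
 * (2) that oldest xmin is older than next_xid - max_age.
 * max_age == 0 means the feature is disabled.
 */
bool
should_attempt_invalidation(TransactionId next_xid,
                            TransactionId oldest_xmin,
                            TransactionId slot_xmin,
                            TransactionId slot_catalog_xmin,
                            uint32_t max_age)
{
    bool          slot_holds_back;
    TransactionId limit;

    if (max_age == 0)
        return false;

    slot_holds_back =
        (slot_xmin != INVALID_XID && oldest_xmin == slot_xmin) ||
        (slot_catalog_xmin != INVALID_XID && oldest_xmin == slot_catalog_xmin);
    if (!slot_holds_back)
        return false;

    limit = xid_retreated_by(next_xid, max_age);
    return xid_precedes(oldest_xmin, limit);
}
```

With next_xid = 1000 and max_age = 100 the limit is 900. A slot xmin of 700 that equals the oldest xmin passes both gates. The same oldest xmin with the slot at 800 does not, because something other than a slot (for example a long-running transaction) is holding the horizon back.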
* Re: Introduce XID age based replication slot invalidation @ 2026-04-05 08:03 Masahiko Sawada <[email protected]> parent: Bharath Rupireddy <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Masahiko Sawada @ 2026-04-05 08:03 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers On Fri, Apr 3, 2026 at 12:05 PM Bharath Rupireddy <[email protected]> wrote: > > Hi, > > On Wed, Apr 1, 2026 at 2:21 PM Bharath Rupireddy > <[email protected]> wrote: > > > > On Wed, Apr 1, 2026 at 12:39 PM Masahiko Sawada <[email protected]> wrote: > > > > > > I've reviewed the v7 patch and have some review comments: > > > > Thank you for reviewing the patch. > > > > I took the above changes into v8 and fixed a typo in using xidLimit > > instead of slotXidLimit. > > > > Please find the attached v8 patches for further review. Thank you! > > Thank you, Sawada-san, for reviewing and providing some offlist comments. > > 1/ Included a note in the docs to say that logical replication slots > are also affected by XID age GUC (similar to > idle_replication_slot_timeout). > > 2/ Added the code to disable the XID age invalidation in > pg_createsubscriber similar to timeout invalidation. Commit 72e6c08fea > ensured that none of the logical replication slots get invalidated > during the upgrade. (I believe the work that pg_upgrade and > pg_createsubscriber do is more important, and the slots created and > used by them or slots in use during those processes must not interfere > with the upgrade or creating a logical replica from a standby.) > > 3/ Changed the max value of XID age GUC to be equal to that of vacuum > failsafe age. In my opinion, the best use of max_slot_xid_age would be > to set it equal to or a little less than vacuum_failsafe_age. Also > added a note in the docs about this. 
> > 4/ Changed variable names for consistency. > > 5/ Added code to MaybeInvalidateXIDAgedSlots() to skip the slot > invalidation attempt (unnecessary work) when slots are not the reason > for holding back the OldestXmin. Added an equality check to see if > OldestXmin is either OldestSlotXmin or OldestSlotCatalogXmin (all > these OldestXXXXmins are computed from the same ComputeXidHorizons() > call). This should allow us to skip the slot invalidation attempt when > a backend is holding the xmin back (a long-running transaction, for > example). > > Please find the attached v9 patches for further review. Thank you! Thank you for updating the patch! I've made some changes including moving MaybeInvalidateXidAgedSlot() to vacuum.c since the function seems more inherently tied to vacuum context. Also, updated the commit message and fixed typos. Please review the attached patch. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com Attachments: [text/x-patch] v10-0001-Introduce-max_slot_xid_age-to-invalidate-old-rep.patch (36.9K, 2-v10-0001-Introduce-max_slot_xid_age-to-invalidate-old-rep.patch) download | inline diff: From 05b6194a321f4168dcc17b1fd5589dfcfb6e49d4 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Fri, 3 Apr 2026 18:27:15 +0000 Subject: [PATCH v10] Introduce max_slot_xid_age to invalidate old replication slots. This commit introduces a new GUC parameter, max_slot_xid_age. It invalidates replication slots whose xmin or catalog_xmin age exceeds the configured limit. While time-based slot invalidation is useful for cleaning up inactive slots, an XID-age-based limit acts as a critical backstop to directly prevent transaction ID wraparound and severe bloat caused by orphaned slots. The invalidation check occurs during both VACUUM (manual and autovacuum) and checkpoints. During vacuum, the check is performed on a per-relation basis. 
Crucially, the XID-age based slot invalidation is considered only when slots' XIDs are actively holding back the OldestXmin for the current relation. In other words, we only invalidate slots when doing so has the potential to advance the vacuum cutoff and allow dead tuple reclamation. Checking slot XIDs per relation could introduce significant performance overhead if it required acquiring ProcArrayLock repeatedly to get replication slot xmin values (xmin and catalog_xmin in procArray). To avoid this, the vacuum cutoff calculation has been optimized. A new function GetOldestNonRemovableTransactionIdWithSlotXids() is introduced to return the global OldestXmin along with the oldest slot xmin and catalog_xmin values. This allows the per-relation invalidation check to be performed with zero additional lock acquisitions. In addition to vacuum, slots are also checked and invalidated during checkpoints. This is particularly important for standby servers where vacuum does not run. Note that on standbys, slots that are currently being synced from the primary (i.e., synced = true) are exempt from this invalidation mechanism. 
Author: Bharath Rupireddy <[email protected]> Co-authored-by: John Hsu <[email protected]> Reviewed-by: Masahiko Sawada <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: shveta malik <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: SATYANARAYANA NARLAPURAM <[email protected]> Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com Discussion: https://postgr.es/m/CA+-JvFsMHckBMzsu5Ov9HCG3AFbMh056hHy1FiXazBRtZ9pFBg@mail.gmail.com --- doc/src/sgml/config.sgml | 54 ++++++ doc/src/sgml/logical-replication.sgml | 4 +- doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 18 ++ src/backend/access/transam/xlog.c | 34 +++- src/backend/commands/vacuum.c | 80 +++++++- src/backend/replication/slot.c | 82 +++++++- src/backend/storage/ipc/procarray.c | 61 ++++-- src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/bin/pg_basebackup/pg_createsubscriber.c | 2 +- src/include/commands/vacuum.h | 10 + src/include/replication/slot.h | 8 +- src/include/storage/procarray.h | 3 + src/test/recovery/t/019_replslot_limit.pl | 175 ++++++++++++++++++ 16 files changed, 524 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index d3fea738ca3..16e45748118 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4764,6 +4764,60 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <structfield>xmin</structfield> age + or <structfield>catalog_xmin</structfield> age 
in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view has exceeded the age specified by this setting. Slot invalidation + due to this limit occurs during vacuum (both <command>VACUUM</command> + command and autovacuum) and during checkpoint. + A value of zero (the default) disables this feature. Users can set + this value anywhere from zero to two billion transactions. This parameter + can only be set in the <filename>postgresql.conf</filename> file or on + the server command line. + </para> + + <para> + The current age of a slot's <literal>xmin</literal> and + <literal>catalog_xmin</literal> can be monitored by applying the + <function>age</function> function to the corresponding columns in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view. + </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + + <para> + It is recommended to set <varname>max_slot_xid_age</varname> + to a value equal to or slightly less than + <xref linkend="guc-vacuum-failsafe-age"/>, so that the slot holding the + vacuum back is invalidated before vacuum enters failsafe mode. + </para> + + <para> + Note that this invalidation mechanism is not applicable for slots + on the standby server that are being synced from the primary server + (i.e., standby slots having + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> + value <literal>true</literal>). Synced slots are always considered to + be inactive because they don't perform logical decoding to produce + changes. 
+ </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml index 23b268273b9..a865a5e6c28 100644 --- a/doc/src/sgml/logical-replication.sgml +++ b/doc/src/sgml/logical-replication.sgml @@ -2649,7 +2649,9 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER <para> Logical replication slots are also affected by - <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>. + <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link> + and + <link linkend="guc-max-slot-xid-age"><varname>max_slot_xid_age</varname></link>. </para> <para> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 9ee1a2bfc6a..1a507b430f9 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. + </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 88c71cd85b6..56b231f2350 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -799,6 +799,24 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params, * to increase the number of dead tuples it can prune away.) 
*/ vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + + /* + * If the current vacuum cutoff (OldestXmin) is being held back by a + * replication slot that has exceeded max_slot_xid_age, attempt to + * invalidate such slots. + */ + if (maybe_invalidate_xid_aged_slots(vacrel->cutoffs.OldestXmin, + vacrel->cutoffs.OldestSlotXmin, + vacrel->cutoffs.OldestSlotCatalogXmin)) + { + /* + * Some slots have been invalidated; re-compute the vacuum cutoffs and + * aggressiveness. + */ + vacrel->aggressive = vacuum_get_cutoffs(rel, params, + &vacrel->cutoffs); + } + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 9e8999bbb61..42285c21cb3 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7401,6 +7401,8 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7845,9 +7847,20 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if (max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -8134,6 +8147,8 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + uint32 
slotInvalidationCauses; + TransactionId slotXidLimit; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -8312,9 +8327,19 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if (max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -9230,6 +9255,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 0ed363d1c85..ed647e77615 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -48,6 +48,7 @@ #include "postmaster/autovacuum.h" #include "postmaster/bgworker_internals.h" #include "postmaster/interrupt.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/pmsignal.h" @@ -1133,7 +1134,10 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params, * that only one vacuum process can be working on a particular table at * any time, and that each vacuum is always an independent transaction. 
*/ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + cutoffs->OldestXmin = + GetOldestNonRemovableTransactionIdWithSlotXids(rel, + &cutoffs->OldestSlotXmin, + &cutoffs->OldestSlotCatalogXmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); @@ -2688,3 +2692,77 @@ vac_tid_reaped(ItemPointer itemptr, void *state) return TidStoreIsMember(dead_items, itemptr); } + +/* + * Invalidate replication slots whose XID age exceeds max_slot_xid_age. + * + * The caller provides the overall oldest xmin along with the oldest + * slot and catalog_xmin, typically all obtained from a single consistent + * snapshot via ComputeXidHorizons(). These values are used to avoid + * unnecessary work: if the global oldest_xmin is held back by something + * other than a replication slot (e.g., a long-running transaction), + * invalidating slots would not advance the horizon and is therefore + * skipped. Similarly, no action is taken if the current horizons have + * not yet exceeded the threshold. + * + * Returns true if at least one slot was invalidated. + */ +bool +maybe_invalidate_xid_aged_slots(TransactionId oldest_xmin, + TransactionId oldest_slot_xmin, + TransactionId oldest_slot_catalog_xmin) +{ + TransactionId xid_limit; + bool slot_holds_oldest_xmin; + + if (max_slot_xid_age == 0) + return false; + + Assert(TransactionIdIsNormal(oldest_xmin)); + + /* + * Check if a replication slot's xmin or catalog_xmin is what's holding + * back oldest_xmin. If not, skip the unnecessary work. + */ + slot_holds_oldest_xmin = + (TransactionIdIsValid(oldest_slot_xmin) && + TransactionIdEquals(oldest_xmin, oldest_slot_xmin)) || + (TransactionIdIsValid(oldest_slot_catalog_xmin) && + TransactionIdEquals(oldest_xmin, oldest_slot_catalog_xmin)); + + if (!slot_holds_oldest_xmin) + return false; + + xid_limit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + + /* + * A replication slot is holding back oldest_xmin. 
We invalidate slots + * that have exceeded the XID age limit. + * + * Note that while a non-catalog vacuum is technically only blocked by + * physical slots' xmin values, we invalidate logical slots too that + * exceed the XID age limit if we trigger the XID-age based slot + * invalidation. One might think that this is unnecessary for non-catalog + * tables as invalidating logical slots while vacuuming a non-catalog + * table doesn't help advance vacuum cutoffs. But performing invalidation + * trials for physical and logical slots would add complexity. + * + * In practice, XID-age-based invalidation is lightweight (e.g., it does + * not require process termination). This unified approach keeps the API + * simple by avoiding the need to distinguish between catalog and + * non-catalog tables here. + * + * Note: Invalidating a slot does not guarantee that the oldest xmin will + * advance. Due to a race condition, a long-running transaction might be + * holding the same xmin as the slot. In such cases, the slot is + * invalidated, but the global horizon remains unchanged. + */ + if (TransactionIdPrecedes(oldest_xmin, xid_limit)) + return InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, InvalidOid, + InvalidTransactionId, + xid_limit); + + return false; +} diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a9092fc2382..833ed196128 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -158,6 +159,12 @@ int max_replication_slots = 10; /* the maximum number of replication */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin older + * than the specified age; '0' disables it. 
+ */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. @@ -1780,7 +1787,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId xidLimit) { StringInfoData err_detail; StringInfoData err_hint; @@ -1825,6 +1835,29 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + TransactionId slot_xid = TransactionIdIsValid(xmin) ? xmin : catalog_xmin; + int32 exceeded_by = (int32) (xidLimit - slot_xid); + int32 slot_age = (int32) max_slot_xid_age + exceeded_by; + + /* Either the slot's xmin or catalog_xmin must be valid */ + Assert(TransactionIdIsValid(slot_xid)); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, + TransactionIdIsValid(xmin) + ? _("The slot's xmin age of %d exceeds the configured \"%s\" of %d by %d transactions") + : _("The slot's catalog xmin age of %d exceeds the configured \"%s\" of %d by %d transactions"), + slot_age, "max_slot_xid_age", max_slot_xid_age, exceeded_by); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1863,6 +1896,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. 
+ */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. @@ -1874,6 +1926,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1945,6 +1998,18 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + Assert(TransactionIdIsValid(xidLimit)); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1967,6 +2032,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2019,6 +2085,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + xidLimit, &inactive_since, now); @@ -2112,7 +2179,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); if (MyBackendType == B_STARTUP) (void) 
SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2165,7 +2233,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); /* done with this slot for now */ break; @@ -2192,6 +2261,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2205,7 +2276,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, + TransactionId xidLimit) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2244,7 +2316,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + xidLimit, &released_lock)) { Assert(released_lock); diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index cc207cb56e3..898ef4a0833 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1937,6 +1937,31 @@ GlobalVisHorizonKindForRel(Relation rel) return VISHORIZON_TEMP; } +/* + * A helper function to return the appropriate oldest non-removable + * TransactionId from the pre-computed horizons, based on the relation + * type. 
+ */ +static pg_attribute_always_inline TransactionId +GetOldestNonRemovableTransactionIdFromHorizons(ComputeXidHorizonsResult *horizons, + Relation rel) +{ + switch (GlobalVisHorizonKindForRel(rel)) + { + case VISHORIZON_SHARED: + return horizons->shared_oldest_nonremovable; + case VISHORIZON_CATALOG: + return horizons->catalog_oldest_nonremovable; + case VISHORIZON_DATA: + return horizons->data_oldest_nonremovable; + case VISHORIZON_TEMP: + return horizons->temp_oldest_nonremovable; + } + + /* just to prevent compiler warnings */ + return InvalidTransactionId; +} + /* * Return the oldest XID for which deleted tuples must be preserved in the * passed table. @@ -1955,20 +1980,30 @@ GetOldestNonRemovableTransactionId(Relation rel) ComputeXidHorizons(&horizons); - switch (GlobalVisHorizonKindForRel(rel)) - { - case VISHORIZON_SHARED: - return horizons.shared_oldest_nonremovable; - case VISHORIZON_CATALOG: - return horizons.catalog_oldest_nonremovable; - case VISHORIZON_DATA: - return horizons.data_oldest_nonremovable; - case VISHORIZON_TEMP: - return horizons.temp_oldest_nonremovable; - } + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); +} - /* just to prevent compiler warnings */ - return InvalidTransactionId; +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. 
+ */ +TransactionId +GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) +{ + ComputeXidHorizonsResult horizons; + + ComputeXidHorizons(&horizons); + + if (slot_xmin) + *slot_xmin = horizons.slot_xmin; + if (slot_catalog_xmin) + *slot_catalog_xmin = horizons.slot_catalog_xmin; + + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); } /* diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index de9092fdf5b..d60f39ec08e 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index a315c4ab8ab..5a438df93d0 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2090,6 +2090,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2100000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 6d0337853e0..1817330484d 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -351,6 +351,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #track_commit_timestamp = off # collect timestamp of transaction commit # (change requires restart) diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c index 37631f700af..9ebc71fcac2 100644 --- a/src/bin/pg_basebackup/pg_createsubscriber.c +++ b/src/bin/pg_basebackup/pg_createsubscriber.c @@ -1901,7 +1901,7 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_ appendPQExpBufferStr(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\""); /* Prevent unintended slot invalidation */ - appendPQExpBufferStr(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\""); + appendPQExpBufferStr(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0 -c max_slot_xid_age=0\""); if (restricted_access) { diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 5b8023616c0..c47558fc6b3 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -287,6 +287,13 @@ struct VacuumCutoffs */ TransactionId FreezeLimit; MultiXactId MultiXactCutoff; + + /* + * Oldest xmin and catalog xmin of any replication slot obtained from the + * same ComputeXidHorizons() call that computed OldestXmin. 
+ */ + TransactionId OldestSlotXmin; + TransactionId OldestSlotCatalogXmin; }; /* @@ -399,6 +406,9 @@ extern IndexBulkDeleteResult *vac_bulkdel_one_index(IndexVacuumInfo *ivinfo, VacDeadItemsInfo *dead_items_info); extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo, IndexBulkDeleteResult *istat); +extern bool maybe_invalidate_xid_aged_slots(TransactionId oldest_xmin, + TransactionId oldest_slot_xmin, + TransactionId oldest_slot_catalog_xmin); /* In postmaster/autovacuum.c */ extern void AutoVacuumUpdateCostLimit(void); diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 4b4709f6e2c..5040d53072b 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is running, or when @@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot; extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* shmem initialization functions */ extern Size ReplicationSlotsShmemSize(void); @@ -367,7 +370,8 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId xidLimit); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int ReplicationSlotIndex(ReplicationSlot *slot); extern bool 
ReplicationSlotName(int index, Name name); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index abdf021e66e..a94091ce7fd 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -53,6 +53,9 @@ extern RunningTransactions GetRunningTransactionData(void); extern bool TransactionIdIsInProgress(TransactionId xid); extern TransactionId GetOldestNonRemovableTransactionId(Relation rel); +extern TransactionId GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern TransactionId GetOldestTransactionIdConsideredRunning(void); extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs); diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..6b7e4818bf4 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,179 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Wait for the given slot to be invalidated with reason 'xid_aged' +sub wait_for_xid_aged_invalidation +{ + my ($node, $slot_name) = @_; + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) or die "Timed out waiting for slot $slot_name to be invalidated"; +} + +# ===================================================================== +# Testcase start: Invalidate physical slot due to max_slot_xid_age GUC + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Disable autovacuum so checkpointer triggers the invalidation +my $max_slot_xid_age = 100; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); + 
+$primary5->start; + +# Create a procedure to consume XIDs +$primary5->safe_psql( + 'postgres', qq{ + CREATE PROCEDURE consume_xid(cnt int) + AS \$\$ + DECLARE + i int; + BEGIN + FOR i IN 1..cnt LOOP + EXECUTE 'SELECT pg_current_xact_id()'; + COMMIT; + END LOOP; + END; + \$\$ LANGUAGE plpgsql; +}); + +# Take a backup for creating standby +$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create standby with HS feedback so the slot gains an xmin +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); +$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); +$primary5->wait_for_catchup($standby5); + +# Wait for the physical slot to get xmin via hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby so the slot becomes inactive with its xmin frozen +$standby5->stop; + +# Advance XIDs past 2x max_slot_xid_age so the slot's xmin is stale enough +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); +$primary5->safe_psql('postgres', "CHECKPOINT"); +wait_for_xid_aged_invalidation($primary5, 'sb5_slot'); +ok(1, "physical slot invalidated due to XID age (via checkpoint)"); + +# Testcase end: Invalidate physical slot due to max_slot_xid_age GUC +# =================================================================== + +# 
==================================================================== +# Testcase start: Invalidate logical slot due to max_slot_xid_age GUC + +# Create a logical slot directly on the primary (no subscriber needed). +# The slot gets a catalog_xmin immediately upon creation. +$primary5->safe_psql('postgres', + "SELECT pg_create_logical_replication_slot('lsub5_slot', 'pgoutput')"); + +$primary5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin"; + +# Advance XIDs past 2x max_slot_xid_age so the slot's catalog_xmin is stale enough +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); + +# Vacuum a user table so OldestXmin does not include the slot's catalog_xmin, +# skipping the invalidation of the slot. +$primary5->safe_psql('postgres', "VACUUM tab_int5"); +is( $primary5->safe_psql( + 'postgres', + qq[SELECT invalidation_reason IS NULL FROM pg_replication_slots WHERE slot_name = 'lsub5_slot';] + ), + 't', + 'logical slot not invalidated after vacuuming a data table'); + +# Vacuum a catalog table so OldestXmin includes the slot's catalog_xmin, +# triggering invalidation of the slot. 
+$primary5->safe_psql('postgres', "VACUUM pg_class"); +wait_for_xid_aged_invalidation($primary5, 'lsub5_slot'); +ok(1, "logical slot invalidated due to XID age (via vacuum)"); + +# Testcase end: Invalidate logical slot due to max_slot_xid_age GUC +# ================================================================== + +# =============================================================================== +# Testcase start: Invalidate logical slot on standby due to max_slot_xid_age GUC + +# Disable max_slot_xid_age on primary and recreate the streaming slot +$primary5->safe_psql( + 'postgres', + q{ +ALTER SYSTEM SET max_slot_xid_age = 0; +SELECT pg_reload_conf(); +}); +$primary5->safe_psql('postgres', + "SELECT pg_drop_replication_slot('sb5_slot')"); +$primary5->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb5_slot', true)"); +$standby5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); +$standby5->start; + +$primary5->wait_for_catchup($standby5); + +$standby5->create_logical_slot_on_standby($primary5, 'sb5_logical_slot', + 'postgres'); + +$standby5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_logical_slot'; +]) or die "Timed out waiting for sb5_logical_slot catalog_xmin"; + +# Advance XIDs on primary, replay on standby, then restartpoint to invalidate +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); +$primary5->safe_psql('postgres', "CHECKPOINT"); +$primary5->wait_for_replay_catchup($standby5); +$standby5->safe_psql('postgres', "CHECKPOINT"); + +wait_for_xid_aged_invalidation($standby5, 'sb5_logical_slot'); +ok(1, "logical (standby) slot invalidated due to XID age (via restartpoint)"); + +$standby5->stop; +$primary5->stop; + +# Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC +# 
============================================================================= + done_testing(); -- 2.53.0
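The threshold arithmetic used in the patch (the xidLimit computation, the wraparound-aware comparison, and the age printed in the invalidation message) can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions: xid_t, xid_retreated_by(), xid_precedes(), slot_xid_too_old() and slot_xid_age() are hypothetical stand-ins written for this sketch, not the PostgreSQL functions themselves, and the sketch handles normal XIDs only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TransactionId: a 32-bit counter that wraps. */
typedef uint32_t xid_t;

/* XIDs 0..2 are reserved in PostgreSQL; normal XIDs start at 3. */
#define FIRST_NORMAL_XID ((xid_t) 3)

/* Back up 'xid' by 'amount', stepping over the reserved XIDs on wraparound. */
static xid_t
xid_retreated_by(xid_t xid, uint32_t amount)
{
    xid -= amount;
    while (xid < FIRST_NORMAL_XID)
        xid--;
    return xid;
}

/* Circular "a is older than b" comparison, valid for normal XIDs only. */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Would a slot holding 'slot_xid' be past the configured age limit? */
static bool
slot_xid_too_old(xid_t slot_xid, xid_t next_xid, uint32_t max_age)
{
    assert(slot_xid >= FIRST_NORMAL_XID);
    return xid_precedes(slot_xid, xid_retreated_by(next_xid, max_age));
}

/* Age as derived for the log message: max_age plus the overshoot past the limit. */
static int32_t
slot_xid_age(xid_t slot_xid, xid_t next_xid, uint32_t max_age)
{
    xid_t   limit = xid_retreated_by(next_xid, max_age);
    int32_t exceeded_by = (int32_t) (limit - slot_xid);

    return (int32_t) max_age + exceeded_by;
}
```

For example, with a next XID of 1000 and max_slot_xid_age = 100, the limit is 900. A slot whose xmin is 750 precedes that limit, so it would be a candidate for invalidation, exceeding the limit by 150 for a reported age of 250. A slot whose xmin is 950 does not precede the limit and is left alone.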
* Re: Introduce XID age based replication slot invalidation
  @ 2026-04-06 02:52 Bharath Rupireddy <[email protected]>
  parent: Masahiko Sawada <[email protected]>
  0 siblings, 1 reply; 31+ messages in thread
From: Bharath Rupireddy @ 2026-04-06 02:52 UTC
To: Masahiko Sawada <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

Hi,

On Sun, Apr 5, 2026 at 1:03 AM Masahiko Sawada <[email protected]> wrote:
>
> Thank you for updating the patch!
>
> I've made some changes including moving MaybeInvalidateXidAgedSlot()
> to vacuum.c since the function seems more inherently tied to vacuum
> context. Also, updated the commit message and fixed typos.
>
> Please review the attached patch.

Thank you Sawada-san!

I took a look at the v10 patch and it LGTM. I tested it - make
check-world passes, pgindent doesn't complain.

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
* Re: Introduce XID age based replication slot invalidation
  @ 2026-04-06 08:44 Masahiko Sawada <[email protected]>
  parent: Bharath Rupireddy <[email protected]>
  0 siblings, 1 reply; 31+ messages in thread
From: Masahiko Sawada @ 2026-04-06 08:44 UTC
To: Bharath Rupireddy <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

On Sun, Apr 5, 2026 at 7:52 PM Bharath Rupireddy <[email protected]> wrote:
>
> Hi,
>
> On Sun, Apr 5, 2026 at 1:03 AM Masahiko Sawada <[email protected]> wrote:
> >
> > Thank you for updating the patch!
> >
> > I've made some changes including moving MaybeInvalidateXidAgedSlot()
> > to vacuum.c since the function seems more inherently tied to vacuum
> > context. Also, updated the commit message and fixed typos.
> >
> > Please review the attached patch.
>
> Thank you Sawada-san!
>
> I took a look at the v10 patch and it LGTM. I tested it - make
> check-world passes, pgindent doesn't complain.
>

While reviewing the patch, I found that with this patch, backend
processes and autovacuum workers can simultaneously attempt to
invalidate the same slot for the same reason. When invalidating a
slot, we send a signal to the process owning the slot and wait for it
to exit and release the slot. If the process takes a long time to exit
for some reason, subsequent autovacuum workers attempting to
invalidate the same slot will also send a SIGTERM and get stuck at
InvalidatePossiblyObsoleteSlot(). In the worst case, this could result
in all autovacuum activity being blocked. I think we need to address
this problem.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
* Re: Introduce XID age based replication slot invalidation
  @ 2026-04-06 17:42 Bharath Rupireddy <[email protected]>
  parent: Masahiko Sawada <[email protected]>
  0 siblings, 2 replies; 31+ messages in thread
From: Bharath Rupireddy @ 2026-04-06 17:42 UTC
To: Masahiko Sawada <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers

Hi,

On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada <[email protected]> wrote:
>
> > I took a look at the v10 patch and it LGTM. I tested it - make
> > check-world passes, pgindent doesn't complain.
>
> While reviewing the patch, I found that with this patch, backend
> processes and autovacuum workers can simultaneously attempt to
> invalidate the same slot for the same reason. When invalidating a
> slot, we send a signal to the process owning the slot and wait for it
> to exit and release the slot. If the process takes a long time to exit
> for some reason, subsequent autovacuum workers attempting to
> invalidate the same slot will also send a SIGTERM and get stuck at
> InvalidatePossiblyObsoleteSlot(). In the worst case, this could result
> in all autovacuum activity being blocked. I think we need to address
> this problem.

Thank you!

You're right that multiple autovacuum workers can wait on the same
slot for SIGTERM to take effect on the process (mainly walsenders)
holding the slot. Once the process holding the slot exits, one worker
finishes the invalidation and the others see it's done and move on.

However, IMHO, this is unlikely to be a problem in practice.

First, SIGTERM must take a long time to terminate the process holding
the slot. This seems unlikely unless I'm missing some cases.

Second, the slot's xmin must be very old (past XID age) while the
process is still running but slow to exit. If we set max_slot_xid_age
close to vacuum_failsafe_age (e.g., 1.6 billion. I've added this note
in the docs), it seems unlikely that the replication connection would
still be active at that point.

Also, concurrent invalidation can already happen today between the
startup process and checkpointer on standby.

If needed, we could add a flag to skip extra invalidation attempts
based on field experience. Since this feature is off by default, I'd
prefer to keep things simple, but I'm open to other approaches.

Thoughts?

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
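The "flag to skip extra invalidation attempts" floated in this message is not part of the posted patch. One possible shape for it, purely as a sketch, is a per-slot claim that the first process takes before signalling the slot owner, so that later arrivals can move on instead of waiting. Everything below is hypothetical: DemoSlot, DemoDecision and the demo_* functions are invented for illustration, and C11 atomics stand in for whatever locking a real implementation would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-slot state; not a PostgreSQL structure. */
typedef struct DemoSlot
{
    atomic_bool invalidated;            /* slot has already been invalidated */
    atomic_bool invalidation_claimed;   /* some process is already working on it */
} DemoSlot;

typedef enum DemoDecision
{
    DEMO_PROCEED,       /* caller should signal the owner and wait */
    DEMO_SKIP_BUSY,     /* another process is handling it; move on */
    DEMO_SKIP_DONE      /* slot is already invalidated; nothing to do */
} DemoDecision;

static void
demo_slot_init(DemoSlot *slot)
{
    atomic_init(&slot->invalidated, false);
    atomic_init(&slot->invalidation_claimed, false);
}

/* Decide whether the caller should perform the invalidation itself. */
static DemoDecision
demo_begin_invalidation(DemoSlot *slot)
{
    bool expected = false;

    if (atomic_load(&slot->invalidated))
        return DEMO_SKIP_DONE;

    /* Only the first caller flips the claim from false to true. */
    if (atomic_compare_exchange_strong(&slot->invalidation_claimed,
                                       &expected, true))
        return DEMO_PROCEED;

    return DEMO_SKIP_BUSY;
}

/* Called by the claiming process once the slot has been invalidated. */
static void
demo_finish_invalidation(DemoSlot *slot)
{
    assert(atomic_load(&slot->invalidation_claimed));
    atomic_store(&slot->invalidated, true);
    atomic_store(&slot->invalidation_claimed, false);
}
```

In this sketch a second autovacuum worker reaching the same slot would get DEMO_SKIP_BUSY and continue with its vacuum rather than block. A real design would also have to release the claim if the claiming process errors out or exits before finishing, which the sketch does not cover.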
* Re: Introduce XID age based replication slot invalidation @ 2026-04-07 14:39 Srinath Reddy Sadipiralla <[email protected]> parent: Bharath Rupireddy <[email protected]> 1 sibling, 0 replies; 31+ messages in thread From: Srinath Reddy Sadipiralla @ 2026-04-07 14:39 UTC (permalink / raw) To: Bharath Rupireddy <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Mon, Apr 6, 2026 at 11:12 PM Bharath Rupireddy < [email protected]> wrote: > Hi, > > On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada <[email protected]> > wrote: > > > > > I took a look at the v10 patch and it LGTM. I tested it - make > > > check-world passes, pgindent doesn't complain. > > > > While reviewing the patch, I found that with this patch, backend > > processes and autovacuum workers can simultaneously attempt to > > invalidate the same slot for the same reason. When invalidating a > > slot, we send a signal to the process owning the slot and wait for it > > to exit and release the slot. If the process takes a long time to exit > > for some reason, subsequent autovacuum workers attempting to > > invalidate the same slot will also send a SIGTERM and get stuck at > > InvalidatePossiblyObsoleteSlot(). In the worst case, this could result > > in all autovacuum activity being blocked. I think we need to address > > this problem. > > Thank you! > > You're right that multiple autovacuum workers can wait on the same > slot for SIGTERM to take effect on the process (mainly walsenders) > holding the slot. Once the process holding the slot exits, one worker > finishes the invalidation and the others see it's done and move on. > > However, IMHO, this is unlikely to be a problem in practice. 
> I was able to reproduce this using pg_recvlogical on a slot by pausing the walsender in a debugger and then doing some hacky things around the GUCs (just to test). In production, IIUC, the walsender can get stuck for "some" time while decoding a large transaction or during a network delay, so backends and autovacuum workers stay stuck until then and resume their work afterwards. Correct me if I am wrong :) > If needed, we could add a flag to skip extra invalidation attempts > based on field experience. > +1, yeah, this would keep other backends or autovacuum workers from retrying the same invalidation and getting stuck; instead they can check the flag, be assured that the slot invalidation is being taken care of, and move on. -- Thanks, Srinath Reddy Sadipiralla EDB: https://www.enterprisedb.com/ ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Introduce XID age based replication slot invalidation @ 2026-04-16 05:03 Bharath Rupireddy <[email protected]> parent: Bharath Rupireddy <[email protected]> 1 sibling, 1 reply; 31+ messages in thread From: Bharath Rupireddy @ 2026-04-16 05:03 UTC (permalink / raw) To: Masahiko Sawada <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Mon, Apr 6, 2026 at 10:42 AM Bharath Rupireddy <[email protected]> wrote: > > On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada <[email protected]> wrote: > > > > > I took a look at the v10 patch and it LGTM. I tested it - make > > > check-world passes, pgindent doesn't complain. > > > > While reviewing the patch, I found that with this patch, backend > > processes and autovacuum workers can simultaneously attempt to > > invalidate the same slot for the same reason. When invalidating a > > slot, we send a signal to the process owning the slot and wait for it > > to exit and release the slot. If the process takes a long time to exit > > for some reason, subsequent autovacuum workers attempting to > > invalidate the same slot will also send a SIGTERM and get stuck at > > InvalidatePossiblyObsoleteSlot(). In the worst case, this could result > > in all autovacuum activity being blocked. I think we need to address > > this problem. > > Thank you! > > You're right that multiple autovacuum workers can wait on the same > slot for SIGTERM to take effect on the process (mainly walsenders) > holding the slot. Once the process holding the slot exits, one worker > finishes the invalidation and the others see it's done and move on. > > However, IMHO, this is unlikely to be a problem in practice. > > First, SIGTERM must take a long time to terminate the process holding > the slot. This seems unlikely unless I'm missing some cases. 
> > Second, the slot's xmin must be very old (past XID age) while the > process is still running but slow to exit. If we set max_slot_xid_age > close to vacuum_failsafe_age (e.g., 1.6 billion. I've added this note > in the docs), it seems unlikely that the replication connection would > still be active at that point. > > Also, concurrent invalidation can already happen today between the > startup process and checkpointer on standby. > > If needed, we could add a flag to skip extra invalidation attempts > based on field experience. Since this feature is off by default, I'd > prefer to keep things simple, but I'm open to other approaches. > > Thoughts? Thank you Sawada-san. I've been thinking more about it and I agree we need to address this. While I still think the scenario is unlikely in practice (SIGTERM would have to take a long time, the slot's xmin would have to be very old while the walsender is still running, etc.), I think it's worth handling. I can think of a couple of approaches: 1. Use ConditionVariableTimedSleep instead of ConditionVariableSleep when called from an autovacuum worker. Workers don't block forever, but they still wait for the timeout duration, still send redundant SIGTERMs, and a correct timeout value needs to be chosen. When it expires, the worker either retries (still stuck) or gives up (same as approach 2). 2. Make the vacuum path non-blocking when another process is already invalidating the same slot. The first process to attempt invalidation proceeds normally: it sends SIGTERM and waits on ConditionVariableSleep for the process holding the slot to exit. But if a subsequent autovacuum worker finds that another process has already initiated invalidation of this slot, it skips the slot and proceeds with vacuum instead of waiting on the same ConditionVariableSleep. I think approach 2 is simple. If another process is already invalidating the slot, there's no reason for the autovacuum worker to also block. 
The tradeoff is that this vacuum cycle's OldestXmin won't move forward and it will need another cycle for this relation. But that's fine given that the scenario as explained above is unlikely to happen in practice. Please let me know if my thinking sounds reasonable. I'm open to other ideas too. Thoughts? -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 31+ messages in thread
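The difference between the two approaches may be easier to see in a small model. The sketch below illustrates approach 2 only and is not taken from patch 0003: SlotSketch, its invalidation_started flag, and try_invalidate are hypothetical names, the model is single-threaded, and the real code would read and update such state under the slot's spinlock and sleep on the slot's condition variable where this sketch returns INVAL_MUST_WAIT.

```c
#include <stdbool.h>

typedef enum
{
    INVAL_DONE,        /* slot is invalidated; nothing left to do */
    INVAL_SKIPPED,     /* someone else is already on it; caller moves on */
    INVAL_MUST_WAIT    /* caller would now sleep until the owner exits */
} InvalResult;

/* Hypothetical, simplified view of a replication slot. */
typedef struct
{
    int  active_pid;            /* 0 when no process holds the slot */
    bool invalidation_started;  /* assumed flag: an invalidator has signalled */
    bool invalidated;
    int  sigterm_count;         /* how many SIGTERMs this model "sent" */
} SlotSketch;

/*
 * One invalidation attempt. 'nonblocking' is true for the vacuum path,
 * which under approach 2 must not wait behind another invalidator.
 */
static InvalResult
try_invalidate(SlotSketch *slot, bool nonblocking)
{
    if (slot->invalidated)
        return INVAL_DONE;

    if (slot->active_pid == 0)
    {
        /* Nobody holds the slot, so it can be invalidated right away. */
        slot->invalidated = true;
        slot->invalidation_started = false;
        return INVAL_DONE;
    }

    /* Another process already signalled the owner: skip instead of waiting. */
    if (slot->invalidation_started && nonblocking)
        return INVAL_SKIPPED;

    /* The first invalidator signals the owner exactly once, then waits. */
    if (!slot->invalidation_started)
    {
        slot->invalidation_started = true;
        slot->sigterm_count++;
    }
    return INVAL_MUST_WAIT;
}
```

In this model the first caller sends one signal and waits, a later autovacuum worker gets INVAL_SKIPPED without sending a redundant SIGTERM, and once active_pid drops to 0 the next attempt completes the invalidation. The skipped worker's vacuum cycle does not benefit from an advanced OldestXmin, which is the tradeoff described above.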
* Re: Introduce XID age based replication slot invalidation @ 2026-04-23 18:11 Bharath Rupireddy <[email protected]> parent: Bharath Rupireddy <[email protected]> 0 siblings, 0 replies; 31+ messages in thread From: Bharath Rupireddy @ 2026-04-23 18:11 UTC (permalink / raw) To: Masahiko Sawada <[email protected]>; +Cc: Srinath Reddy Sadipiralla <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; Hayato Kuroda (Fujitsu) <[email protected]>; John H <[email protected]>; pgsql-hackers Hi, On Wed, Apr 15, 2026 at 10:03 PM Bharath Rupireddy <[email protected]> wrote: > > Hi, > > On Mon, Apr 6, 2026 at 10:42 AM Bharath Rupireddy > <[email protected]> wrote: > > > > On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada <[email protected]> wrote: > > > > > > > I took a look at the v10 patch and it LGTM. I tested it - make > > > > check-world passes, pgindent doesn't complain. > > > > > > While reviewing the patch, I found that with this patch, backend > > > processes and autovacuum workers can simultaneously attempt to > > > invalidate the same slot for the same reason. When invalidating a > > > slot, we send a signal to the process owning the slot and wait for it > > > to exit and release the slot. If the process takes a long time to exit > > > for some reason, subsequent autovacuum workers attempting to > > > invalidate the same slot will also send a SIGTERM and get stuck at > > > InvalidatePossiblyObsoleteSlot(). In the worst case, this could result > > > in all autovacuum activity being blocked. I think we need to address > > > this problem. > > > > Thank you! > > > > You're right that multiple autovacuum workers can wait on the same > > slot for SIGTERM to take effect on the process (mainly walsenders) > > holding the slot. Once the process holding the slot exits, one worker > > finishes the invalidation and the others see it's done and move on. > > > > However, IMHO, this is unlikely to be a problem in practice. 
> > > > First, SIGTERM must take a long time to terminate the process holding > > the slot. This seems unlikely unless I'm missing some cases. > > > > Second, the slot's xmin must be very old (past XID age) while the > > process is still running but slow to exit. If we set max_slot_xid_age > > close to vacuum_failsafe_age (e.g., 1.6 billion. I've added this note > > in the docs), it seems unlikely that the replication connection would > > still be active at that point. > > > > Also, concurrent invalidation can already happen today between the > > startup process and checkpointer on standby. > > > > If needed, we could add a flag to skip extra invalidation attempts > > based on field experience. Since this feature is off by default, I'd > > prefer to keep things simple, but I'm open to other approaches. > > > > Thoughts? > > Thank you Sawada-san. I've been thinking more about it and I agree we > need to address this. While I still think the scenario is unlikely in > practice (SIGTERM would have to take a long time, the slot's xmin > would have to be very old while the walsender is still running, etc.), > I think it's worth handling. > > I can think of a couple of approaches: > > 1. Use ConditionVariableTimedSleep instead of ConditionVariableSleep > when called from an autovacuum worker. Workers don't block forever, > but they still wait for the timeout duration, still send redundant > SIGTERMs, and a correct timeout value needs to be chosen. When it > expires, the worker either retries (still stuck) or gives up (same as > approach 2). > > 2. Make the vacuum path non-blocking when another process is already > invalidating the same slot. The first process to attempt invalidation > proceeds normally: it sends SIGTERM and waits on > ConditionVariableSleep for the process holding the slot to exit. 
But > if a subsequent autovacuum worker finds that another process has > already initiated invalidation of this slot, it skips the slot and > proceeds with vacuum instead of waiting on the same > ConditionVariableSleep. > > I think approach 2 is simple. If another process is already > invalidating the slot, there's no reason for the autovacuum worker to > also block. The tradeoff is that this vacuum cycle's OldestXmin won't > move forward and it will need another cycle for this relation. But > that's fine given that the scenario as explained above is unlikely to > happen in practice. > > Please let me know if my thinking sounds reasonable. I'm open to other > ideas too. > > Thoughts? I implemented approach 2 (patch 0003). I added an injection point to mimic the walsender taking time to process SIGTERM, so that the process invalidating the slot waits on the slot's CV. Please have a look and share your thoughts. Thank you! -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com Attachments: [application/x-patch] v11-0001-Introduce-max_slot_xid_age-to-invalidate-old-rep.patch (38.7K, 2-v11-0001-Introduce-max_slot_xid_age-to-invalidate-old-rep.patch) download | inline diff: From 0d879d4dc2812ed5c071d53a06d936b09bd62957 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Fri, 17 Apr 2026 18:29:41 +0000 Subject: [PATCH v11 1/3] Introduce max_slot_xid_age to invalidate old replication slots This commit introduces a new GUC parameter, max_slot_xid_age. It invalidates replication slots whose xmin or catalog_xmin age exceeds the configured limit. While time-based slot invalidation is useful for cleaning up inactive slots, an XID-age-based limit acts as a critical backstop to directly prevent transaction ID wraparound and severe bloat caused by orphaned slots. The invalidation check occurs during both VACUUM (manual and autovacuum) and checkpoints. During vacuum, the check is performed on a per-relation basis. 
Crucially, the XID-age based slot invalidation is considered only when slots' XIDs are actively holding back the OldestXmin for the current relation. In other words, we only invalidate slots when doing so has the potential to advance the vacuum cutoff and allow dead tuple reclamation. Checking slot XIDs per relation could introduce significant performance overhead if it required acquiring ProcArrayLock repeatedly to get replication slot xmin values (xmin and catalog_xmin in procArray). To avoid this, the vacuum cutoff calculation has been optimized. A new function GetOldestNonRemovableTransactionIdWithSlotXids() is introduced to return the global OldestXmin along with the oldest slot xmin and catalog_xmin values. This allows the per-relation invalidation check to be performed with zero additional lock acquisitions. In addition to vacuum, slots are also checked and invalidated during checkpoints. This is particularly important for standby servers where vacuum does not run. Note that on standbys, slots that are currently being synced from the primary (i.e., synced = true) are exempt from this invalidation mechanism. 
Author: Bharath Rupireddy <[email protected]> Co-authored-by: John Hsu <[email protected]> Reviewed-by: Masahiko Sawada <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: shveta malik <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: SATYANARAYANA NARLAPURAM <[email protected]> Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com Discussion: https://postgr.es/m/CA+-JvFsMHckBMzsu5Ov9HCG3AFbMh056hHy1FiXazBRtZ9pFBg@mail.gmail.com --- doc/src/sgml/config.sgml | 54 ++++++ doc/src/sgml/logical-replication.sgml | 4 +- doc/src/sgml/system-views.sgml | 8 + src/backend/access/heap/vacuumlazy.c | 18 ++ src/backend/access/transam/xlog.c | 34 +++- src/backend/commands/vacuum.c | 80 +++++++- src/backend/commands/vacuumparallel.c | 26 +++ src/backend/replication/slot.c | 82 +++++++- src/backend/storage/ipc/procarray.c | 61 ++++-- src/backend/storage/ipc/standby.c | 3 +- src/backend/utils/misc/guc_parameters.dat | 8 + src/backend/utils/misc/postgresql.conf.sample | 2 + src/bin/pg_basebackup/pg_createsubscriber.c | 2 +- src/include/commands/vacuum.h | 10 + src/include/replication/slot.h | 8 +- src/include/storage/procarray.h | 3 + src/test/recovery/t/019_replslot_limit.pl | 175 ++++++++++++++++++ 17 files changed, 550 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 67da9a1de66..ff24e259a43 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4948,6 +4948,60 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age"> + <term><varname>max_slot_xid_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>max_slot_xid_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Invalidate replication slots whose <structfield>xmin</structfield> age + 
or <structfield>catalog_xmin</structfield> age in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view has exceeded the age specified by this setting. Slot invalidation + due to this limit occurs during vacuum (both <command>VACUUM</command> + command and autovacuum) and during checkpoint. + A value of zero (the default) disables this feature. Users can set + this value anywhere from zero to two billion transactions. This parameter + can only be set in the <filename>postgresql.conf</filename> file or on + the server command line. + </para> + + <para> + The current age of a slot's <literal>xmin</literal> and + <literal>catalog_xmin</literal> can be monitored by applying the + <function>age</function> function to the corresponding columns in the + <link linkend="view-pg-replication-slots">pg_replication_slots</link> + view. + </para> + + <para> + Idle or forgotten replication slots can hold back vacuum, leading to + bloat and eventually transaction ID wraparound. This setting avoids + that by invalidating slots that have fallen too far behind. + See <xref linkend="routine-vacuuming"/> for more details. + </para> + + <para> + It is recommended to set <varname>max_slot_xid_age</varname> + to a value equal to or slightly less than + <xref linkend="guc-vacuum-failsafe-age"/>, so that the slot holding the + vacuum back is invalidated before vacuum enters failsafe mode. + </para> + + <para> + Note that this invalidation mechanism is not applicable for slots + on the standby server that are being synced from the primary server + (i.e., standby slots having + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> + value <literal>true</literal>). Synced slots are always considered to + be inactive because they don't perform logical decoding to produce + changes. 
+ </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout"> <term><varname>wal_sender_timeout</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml index 598e23ad4f5..b20f2133d99 100644 --- a/doc/src/sgml/logical-replication.sgml +++ b/doc/src/sgml/logical-replication.sgml @@ -2649,7 +2649,9 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER <para> Logical replication slots are also affected by - <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>. + <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link> + and + <link linkend="guc-max-slot-xid-age"><varname>max_slot_xid_age</varname></link>. </para> <para> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 2ebec6928d5..5862f9841c0 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx <xref linkend="guc-idle-replication-slot-timeout"/> duration. </para> </listitem> + <listitem> + <para> + <literal>xid_aged</literal> means that the slot's + <literal>xmin</literal> or <literal>catalog_xmin</literal> + has reached the age specified by + <xref linkend="guc-max-slot-xid-age"/> parameter. + </para> + </listitem> </itemizedlist> </para></entry> </row> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 39395aed0d5..19b107a9b70 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -800,6 +800,24 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params, * to increase the number of dead tuples it can prune away.) 
*/ vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); + + /* + * If the current vacuum cutoff (OldestXmin) is being held back by a + * replication slot that has exceeded max_slot_xid_age, attempt to + * invalidate such slots. + */ + if (maybe_invalidate_xid_aged_slots(vacrel->cutoffs.OldestXmin, + vacrel->cutoffs.OldestSlotXmin, + vacrel->cutoffs.OldestSlotCatalogXmin)) + { + /* + * Some slots have been invalidated; re-compute the vacuum cutoffs and + * aggressiveness. + */ + vacrel->aggressive = vacuum_get_cutoffs(rel, params, + &vacrel->cutoffs); + } + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f85b5286086..823ebadab62 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7405,6 +7405,8 @@ CreateCheckPoint(int flags) VirtualTransactionId *vxids; int nvxids; int oldXLogAllowed = 0; + uint32 slotInvalidationCauses; + TransactionId slotXidLimit; /* * An end-of-recovery checkpoint is really a shutdown checkpoint, just @@ -7849,9 +7851,20 @@ CreateCheckPoint(int flags) */ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if (max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -8138,6 +8151,8 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + uint32 
slotInvalidationCauses; + TransactionId slotXidLimit; /* Concurrent checkpoint/restartpoint cannot happen */ Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); @@ -8316,9 +8331,19 @@ CreateRestartPoint(int flags) INJECTION_POINT("restartpoint-before-slot-invalidation", NULL); - if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT, + slotInvalidationCauses = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT; + slotXidLimit = InvalidTransactionId; + if (max_slot_xid_age > 0) + { + slotInvalidationCauses |= RS_INVAL_XID_AGE; + slotXidLimit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + } + + if (InvalidateObsoleteReplicationSlots(slotInvalidationCauses, _logSegNo, InvalidOid, - InvalidTransactionId)) + InvalidTransactionId, + slotXidLimit)) { /* * Some slots have been invalidated; recalculate the old-segment @@ -9234,6 +9259,7 @@ xlog_redo(XLogReaderState *record) */ InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, + InvalidTransactionId, InvalidTransactionId); } else if (sync_replication_slots) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 99d0db82ed7..73439a163ef 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -48,6 +48,7 @@ #include "postmaster/autovacuum.h" #include "postmaster/bgworker_internals.h" #include "postmaster/interrupt.h" +#include "replication/slot.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/pmsignal.h" @@ -1134,7 +1135,10 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params, * that only one vacuum process can be working on a particular table at * any time, and that each vacuum is always an independent transaction. 
*/ - cutoffs->OldestXmin = GetOldestNonRemovableTransactionId(rel); + cutoffs->OldestXmin = + GetOldestNonRemovableTransactionIdWithSlotXids(rel, + &cutoffs->OldestSlotXmin, + &cutoffs->OldestSlotCatalogXmin); Assert(TransactionIdIsNormal(cutoffs->OldestXmin)); @@ -2708,3 +2712,77 @@ vac_tid_reaped(ItemPointer itemptr, void *state) return TidStoreIsMember(dead_items, itemptr); } + +/* + * Invalidate replication slots whose XID age exceeds max_slot_xid_age. + * + * The caller provides the overall oldest xmin along with the oldest + * slot and catalog_xmin, typically all obtained from a single consistent + * snapshot via ComputeXidHorizons(). These values are used to avoid + * unnecessary work: if the global oldest_xmin is held back by something + * other than a replication slot (e.g., a long-running transaction), + * invalidating slots would not advance the horizon and is therefore + * skipped. Similarly, no action is taken if the current horizons have + * not yet exceeded the threshold. + * + * Returns true if at least one slot was invalidated. + */ +bool +maybe_invalidate_xid_aged_slots(TransactionId oldest_xmin, + TransactionId oldest_slot_xmin, + TransactionId oldest_slot_catalog_xmin) +{ + TransactionId xid_limit; + bool slot_holds_oldest_xmin; + + if (max_slot_xid_age == 0) + return false; + + Assert(TransactionIdIsNormal(oldest_xmin)); + + /* + * Check if a replication slot's xmin or catalog_xmin is what's holding + * back oldest_xmin. If not, skip the unnecessary work. + */ + slot_holds_oldest_xmin = + (TransactionIdIsValid(oldest_slot_xmin) && + TransactionIdEquals(oldest_xmin, oldest_slot_xmin)) || + (TransactionIdIsValid(oldest_slot_catalog_xmin) && + TransactionIdEquals(oldest_xmin, oldest_slot_catalog_xmin)); + + if (!slot_holds_oldest_xmin) + return false; + + xid_limit = TransactionIdRetreatedBy(ReadNextTransactionId(), + max_slot_xid_age); + + /* + * A replication slot is holding back oldest_xmin. 
We invalidate slots + * that have exceeded the XID age limit. + * + * Note that while a non-catalog vacuum is technically only blocked by + * physical slots' xmin values, we invalidate logical slots too that + * exceed the XID age limit if we trigger the XID-age based slot + * invalidation. One might think that this is unnecessary for non-catalog + * tables as invalidating logical slots while vacuuming a non-catalog + * table doesn't help advance vacuum cutoffs. But performing invalidation + * trials for physical and logical slots would add complexity. + * + * In practice, XID-age-based invalidation is lightweight (e.g., it does + * not require process termination). This unified approach keeps the API + * simple by avoiding the need to distinguish between catalog and + * non-catalog tables here. + * + * Note: Invalidating a slot does not guarantee that the oldest xmin will + * advance. Due to a race condition, a long-running transaction might be + * holding the same xmin as the slot. In such cases, the slot is + * invalidated, but the global horizon remains unchanged. + */ + if (TransactionIdPrecedes(oldest_xmin, xid_limit)) + return InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, + 0, InvalidOid, + InvalidTransactionId, + xid_limit); + + return false; +} diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c index 979c2be4abd..deffc537269 100644 --- a/src/backend/commands/vacuumparallel.c +++ b/src/backend/commands/vacuumparallel.c @@ -47,6 +47,7 @@ #include "storage/proc.h" #include "tcop/tcopprot.h" #include "utils/lsyscache.h" +#include "utils/ps_status.h" #include "utils/rel.h" /* @@ -1097,6 +1098,28 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel, pvs->indname = pstrdup(RelationGetRelationName(indrel)); pvs->status = indstats->status; + /* + * Update the ps display to show which index this worker is currently + * processing, along with the table and index OIDs. 
This makes it easy + * to identify which index a parallel vacuum worker is stuck on via + * "ps -ef". For example: + * "parallel worker for PID 12345: vacuuming index idx_foo (table OID 16384, index OID 16385)" + */ + { + char ps_suffix[128]; + const char *phase; + + phase = (indstats->status == PARALLEL_INDVAC_STATUS_NEED_BULKDELETE) + ? "vacuuming" : "cleaning up"; + snprintf(ps_suffix, sizeof(ps_suffix), + ": %s index \"%s\" (table OID %u, index OID %u)", + phase, + RelationGetRelationName(indrel), + RelationGetRelid(pvs->heaprel), + RelationGetRelid(indrel)); + set_ps_display_suffix(ps_suffix); + } + switch (indstats->status) { case PARALLEL_INDVAC_STATUS_NEED_BULKDELETE: @@ -1144,6 +1167,9 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel, pfree(pvs->indname); pvs->indname = NULL; + /* Clear the ps display suffix now that this index is done */ + set_ps_display_suffix(""); + /* * Call the parallel variant of pgstat_progress_incr_param so workers can * report progress of index vacuum to the leader. diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 83fcde74718..fe43f5c8820 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -118,6 +118,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = { {RS_INVAL_HORIZON, "rows_removed"}, {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"}, {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"}, + {RS_INVAL_XID_AGE, "xid_aged"}, }; /* @@ -169,6 +170,12 @@ int max_repack_replication_slots = 5; /* the maximum number of slots */ int idle_replication_slot_timeout_secs = 0; +/* + * Invalidate replication slots that have xmin or catalog_xmin older + * than the specified age; '0' disables it. + */ +int max_slot_xid_age = 0; + /* * This GUC lists streaming replication standby server slot names that * logical WAL sender processes will wait for. 
@@ -1794,7 +1801,10 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, XLogRecPtr restart_lsn, XLogRecPtr oldestLSN, TransactionId snapshotConflictHorizon, - long slot_idle_seconds) + long slot_idle_seconds, + TransactionId xmin, + TransactionId catalog_xmin, + TransactionId xidLimit) { StringInfoData err_detail; StringInfoData err_hint; @@ -1839,6 +1849,29 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause, "idle_replication_slot_timeout"); break; } + + case RS_INVAL_XID_AGE: + { + TransactionId slot_xid = TransactionIdIsValid(xmin) ? xmin : catalog_xmin; + int32 exceeded_by = (int32) (xidLimit - slot_xid); + int32 slot_age = (int32) max_slot_xid_age + exceeded_by; + + /* Either the slot's xmin or catalog_xmin must be valid */ + Assert(TransactionIdIsValid(slot_xid)); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, + TransactionIdIsValid(xmin) + ? _("The slot's xmin age of %d exceeds the configured \"%s\" of %d by %d transactions") + : _("The slot's catalog xmin age of %d exceeds the configured \"%s\" of %d by %d transactions"), + slot_age, "max_slot_xid_age", max_slot_xid_age, exceeded_by); + + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_hint, _("You might need to increase \"%s\"."), + "max_slot_xid_age"); + break; + } + case RS_INVAL_NONE: pg_unreachable(); } @@ -1877,6 +1910,25 @@ CanInvalidateIdleSlot(ReplicationSlot *s) !(RecoveryInProgress() && s->data.synced)); } +/* + * Can we invalidate an XID-aged replication slot? + * + * XID-aged based invalidation is allowed to the given slot when: + * + * 1. Max XID-age is set + * 2. Slot has valid xmin or catalog_xmin + * 3. The slot is not being synced from the primary while the server is in + * recovery. 
+ */ +static inline bool +CanInvalidateXidAgedSlot(ReplicationSlot *s) +{ + return (max_slot_xid_age != 0 && + (TransactionIdIsValid(s->data.xmin) || + TransactionIdIsValid(s->data.catalog_xmin)) && + !(RecoveryInProgress() && s->data.synced)); +} + /* * DetermineSlotInvalidationCause - Determine the cause for which a slot * becomes invalid among the given possible causes. @@ -1888,6 +1940,7 @@ static ReplicationSlotInvalidationCause DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, TimestampTz *inactive_since, TimestampTz now) { Assert(possible_causes != RS_INVAL_NONE); @@ -1959,6 +2012,18 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, } } + /* Check if the slot needs to be invalidated due to max_slot_xid_age GUC */ + if ((possible_causes & RS_INVAL_XID_AGE) && CanInvalidateXidAgedSlot(s)) + { + Assert(TransactionIdIsValid(xidLimit)); + + if ((TransactionIdIsValid(s->data.xmin) && + TransactionIdPrecedes(s->data.xmin, xidLimit)) || + (TransactionIdIsValid(s->data.catalog_xmin) && + TransactionIdPrecedes(s->data.catalog_xmin, xidLimit))) + return RS_INVAL_XID_AGE; + } + return RS_INVAL_NONE; } @@ -1981,6 +2046,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReplicationSlot *s, XLogRecPtr oldestLSN, Oid dboid, TransactionId snapshotConflictHorizon, + TransactionId xidLimit, bool *released_lock_out) { int last_signaled_pid = 0; @@ -2033,6 +2099,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, + xidLimit, &inactive_since, now); @@ -2126,7 +2193,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, true, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); if (MyBackendType == B_STARTUP) (void) 
SignalRecoveryConflict(GetPGProcByNumber(active_proc), @@ -2179,7 +2247,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, ReportSlotInvalidation(invalidation_cause, false, active_pid, slotname, restart_lsn, oldestLSN, snapshotConflictHorizon, - slot_idle_secs); + slot_idle_secs, s->data.xmin, + s->data.catalog_xmin, xidLimit); /* done with this slot for now */ break; @@ -2206,6 +2275,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * logical. * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured * "idle_replication_slot_timeout" duration. + * - RS_INVAL_XID_AGE: slot xid age is older than the configured + * "max_slot_xid_age" age. * * Note: This function attempts to invalidate the slot for multiple possible * causes in a single pass, minimizing redundant iterations. The "cause" @@ -2219,7 +2290,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon) + TransactionId snapshotConflictHorizon, + TransactionId xidLimit) { XLogRecPtr oldestLSN; bool invalidated = false; @@ -2258,7 +2330,7 @@ restart: if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid, snapshotConflictHorizon, - &released_lock)) + xidLimit, &released_lock)) { Assert(released_lock); diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 9299bcebbda..a352fd7ff3a 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -1929,6 +1929,31 @@ GlobalVisHorizonKindForRel(Relation rel) return VISHORIZON_TEMP; } +/* + * A helper function to return the appropriate oldest non-removable + * TransactionId from the pre-computed horizons, based on the relation + * type. 
+ */ +static pg_attribute_always_inline TransactionId +GetOldestNonRemovableTransactionIdFromHorizons(ComputeXidHorizonsResult *horizons, + Relation rel) +{ + switch (GlobalVisHorizonKindForRel(rel)) + { + case VISHORIZON_SHARED: + return horizons->shared_oldest_nonremovable; + case VISHORIZON_CATALOG: + return horizons->catalog_oldest_nonremovable; + case VISHORIZON_DATA: + return horizons->data_oldest_nonremovable; + case VISHORIZON_TEMP: + return horizons->temp_oldest_nonremovable; + } + + /* just to prevent compiler warnings */ + return InvalidTransactionId; +} + /* * Return the oldest XID for which deleted tuples must be preserved in the * passed table. @@ -1947,20 +1972,30 @@ GetOldestNonRemovableTransactionId(Relation rel) ComputeXidHorizons(&horizons); - switch (GlobalVisHorizonKindForRel(rel)) - { - case VISHORIZON_SHARED: - return horizons.shared_oldest_nonremovable; - case VISHORIZON_CATALOG: - return horizons.catalog_oldest_nonremovable; - case VISHORIZON_DATA: - return horizons.data_oldest_nonremovable; - case VISHORIZON_TEMP: - return horizons.temp_oldest_nonremovable; - } + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); +} - /* just to prevent compiler warnings */ - return InvalidTransactionId; +/* + * Same as GetOldestNonRemovableTransactionId(), but also returns the + * replication slot xmin and catalog_xmin from the same ComputeXidHorizons() + * call. This avoids a separate ProcArrayLock acquisition when the caller + * needs both values. 
+ */ +TransactionId +GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin) +{ + ComputeXidHorizonsResult horizons; + + ComputeXidHorizons(&horizons); + + if (slot_xmin) + *slot_xmin = horizons.slot_xmin; + if (slot_catalog_xmin) + *slot_catalog_xmin = horizons.slot_catalog_xmin; + + return GetOldestNonRemovableTransactionIdFromHorizons(&horizons, rel); } /* diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c index 29af7733948..d54d6cc7544 100644 --- a/src/backend/storage/ipc/standby.c +++ b/src/backend/storage/ipc/standby.c @@ -504,7 +504,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon, */ if (IsLogicalDecodingEnabled() && isCatalogRel) InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid, - snapshotConflictHorizon); + snapshotConflictHorizon, + InvalidTransactionId); } /* diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 83af594d4af..4738c84c6c5 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -2132,6 +2132,14 @@ max => 'MAX_KILOBYTES', }, +{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING', + short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.', + variable => 'max_slot_xid_age', + boot_val => '0', + min => '0', + max => '2100000000', +}, + # We use the hopefully-safely-small value of 100kB as the compiled-in # default for max_stack_depth. InitializeGUCOptions will increase it # if possible, depending on the actual platform-specific stack limit. 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index ac38cddaaf9..358b6edc9f1 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -361,6 +361,8 @@ #wal_keep_size = 0 # in megabytes; 0 disables #max_slot_wal_keep_size = -1 # in megabytes; -1 disables #idle_replication_slot_timeout = 0 # in seconds; 0 disables +#max_slot_xid_age = 0 # maximum XID age before a replication slot + # gets invalidated; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #wal_sender_shutdown_timeout = -1 # in milliseconds; -1 disables #track_commit_timestamp = off # collect timestamp of transaction commit diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c index 15e06e5686e..f75dc79fc03 100644 --- a/src/bin/pg_basebackup/pg_createsubscriber.c +++ b/src/bin/pg_basebackup/pg_createsubscriber.c @@ -1678,7 +1678,7 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_ appendPQExpBufferStr(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\""); /* Prevent unintended slot invalidation */ - appendPQExpBufferStr(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\""); + appendPQExpBufferStr(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0 -c max_slot_xid_age=0\""); if (restricted_access) { diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 956d9cea36d..7d81e3d1906 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -287,6 +287,13 @@ struct VacuumCutoffs */ TransactionId FreezeLimit; MultiXactId MultiXactCutoff; + + /* + * Oldest xmin and catalog xmin of any replication slot obtained from the + * same ComputeXidHorizons() call that computed OldestXmin. 
+ */ + TransactionId OldestSlotXmin; + TransactionId OldestSlotCatalogXmin; }; /* @@ -399,6 +406,9 @@ extern IndexBulkDeleteResult *vac_bulkdel_one_index(IndexVacuumInfo *ivinfo, VacDeadItemsInfo *dead_items_info); extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo, IndexBulkDeleteResult *istat); +extern bool maybe_invalidate_xid_aged_slots(TransactionId oldest_xmin, + TransactionId oldest_slot_xmin, + TransactionId oldest_slot_catalog_xmin); /* In postmaster/autovacuum.c */ extern void AutoVacuumUpdateCostLimit(void); diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index 77c8d0975b6..cad1d89b05b 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause RS_INVAL_WAL_LEVEL = (1 << 2), /* idle slot timeout has occurred */ RS_INVAL_IDLE_TIMEOUT = (1 << 3), + /* slot's xmin or catalog_xmin has reached max xid age */ + RS_INVAL_XID_AGE = (1 << 4), } ReplicationSlotInvalidationCause; /* Maximum number of invalidation causes */ -#define RS_INVAL_MAX_CAUSES 4 +#define RS_INVAL_MAX_CAUSES 5 /* * When the slot synchronization worker is running, or when @@ -327,6 +329,7 @@ extern PGDLLIMPORT int max_replication_slots; extern PGDLLIMPORT int max_repack_replication_slots; extern PGDLLIMPORT char *synchronized_standby_slots; extern PGDLLIMPORT int idle_replication_slot_timeout_secs; +extern PGDLLIMPORT int max_slot_xid_age; /* management of individual slots */ extern void ReplicationSlotCreate(const char *name, bool db_specific, @@ -364,7 +367,8 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid); extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes, XLogSegNo oldestSegno, Oid dboid, - TransactionId snapshotConflictHorizon); + TransactionId snapshotConflictHorizon, + TransactionId xidLimit); extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock); extern int 
ReplicationSlotIndex(ReplicationSlot *slot); extern bool ReplicationSlotName(int index, Name name); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index ec89c448220..b1dee4ad889 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -51,6 +51,9 @@ extern RunningTransactions GetRunningTransactionData(Oid dbid); extern bool TransactionIdIsInProgress(TransactionId xid); extern TransactionId GetOldestNonRemovableTransactionId(Relation rel); +extern TransactionId GetOldestNonRemovableTransactionIdWithSlotXids(Relation rel, + TransactionId *slot_xmin, + TransactionId *slot_catalog_xmin); extern TransactionId GetOldestTransactionIdConsideredRunning(void); extern TransactionId GetOldestActiveTransactionId(bool inCommitOnly, bool allDbs); diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7b253e64d9c..6b7e4818bf4 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -540,4 +540,179 @@ is( $publisher4->safe_psql( $publisher4->stop; $subscriber4->stop; +# Wait for the given slot to be invalidated with reason 'xid_aged' +sub wait_for_xid_aged_invalidation +{ + my ($node, $slot_name) = @_; + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + active = false AND + invalidation_reason = 'xid_aged'; + ]) or die "Timed out waiting for slot $slot_name to be invalidated"; +} + +# ===================================================================== +# Testcase start: Invalidate physical slot due to max_slot_xid_age GUC + +# Initialize primary node for XID age tests +my $primary5 = PostgreSQL::Test::Cluster->new('primary5'); +$primary5->init(allows_streaming => 'logical'); + +# Disable autovacuum so checkpointer triggers the invalidation +my $max_slot_xid_age = 100; +$primary5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = 
$max_slot_xid_age +autovacuum = off +}); + +$primary5->start; + +# Create a procedure to consume XIDs +$primary5->safe_psql( + 'postgres', qq{ + CREATE PROCEDURE consume_xid(cnt int) + AS \$\$ + DECLARE + i int; + BEGIN + FOR i IN 1..cnt LOOP + EXECUTE 'SELECT pg_current_xact_id()'; + COMMIT; + END LOOP; + END; + \$\$ LANGUAGE plpgsql; +}); + +# Take a backup for creating standby +$backup_name = 'backup5'; +$primary5->backup($backup_name); + +# Create standby with HS feedback so the slot gains an xmin +my $standby5 = PostgreSQL::Test::Cluster->new('standby5'); +$standby5->init_from_backup($primary5, $backup_name, has_streaming => 1); +$standby5->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb5_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); +$primary5->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb5_slot', immediately_reserve := true); +]); +$standby5->start; + +# Create some content on primary to move xmin +$primary5->safe_psql('postgres', + "CREATE TABLE tab_int5 AS SELECT generate_series(1,10) AS a"); +$primary5->wait_for_catchup($standby5); + +# Wait for the physical slot to get xmin via hot_standby_feedback +$primary5->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_slot'; +]) or die "Timed out waiting for slot sb5_slot xmin from HS feedback"; + +# Stop standby so the slot becomes inactive with its xmin frozen +$standby5->stop; + +# Advance XIDs past 2x max_slot_xid_age so the slot's xmin is stale enough +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); +$primary5->safe_psql('postgres', "CHECKPOINT"); +wait_for_xid_aged_invalidation($primary5, 'sb5_slot'); +ok(1, "physical slot invalidated due to XID age (via checkpoint)"); + +# Testcase end: Invalidate physical slot due to max_slot_xid_age GUC +# =================================================================== + +# 
==================================================================== +# Testcase start: Invalidate logical slot due to max_slot_xid_age GUC + +# Create a logical slot directly on the primary (no subscriber needed). +# The slot gets a catalog_xmin immediately upon creation. +$primary5->safe_psql('postgres', + "SELECT pg_create_logical_replication_slot('lsub5_slot', 'pgoutput')"); + +$primary5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'lsub5_slot'; +]) or die "Timed out waiting for slot lsub5_slot catalog_xmin"; + +# Advance XIDs past 2x max_slot_xid_age so the slot's catalog_xmin is stale enough +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); + +# Vacuum a user table so OldestXmin does not include the slot's catalog_xmin, +# skipping the invalidation of the slot. +$primary5->safe_psql('postgres', "VACUUM tab_int5"); +is( $primary5->safe_psql( + 'postgres', + qq[SELECT invalidation_reason IS NULL FROM pg_replication_slots WHERE slot_name = 'lsub5_slot';] + ), + 't', + 'logical slot not invalidated after vacuuming a data table'); + +# Vacuum a catalog table so OldestXmin includes the slot's catalog_xmin, +# triggering invalidation of the slot. 
+$primary5->safe_psql('postgres', "VACUUM pg_class"); +wait_for_xid_aged_invalidation($primary5, 'lsub5_slot'); +ok(1, "logical slot invalidated due to XID age (via vacuum)"); + +# Testcase end: Invalidate logical slot due to max_slot_xid_age GUC +# ================================================================== + +# =============================================================================== +# Testcase start: Invalidate logical slot on standby due to max_slot_xid_age GUC + +# Disable max_slot_xid_age on primary and recreate the streaming slot +$primary5->safe_psql( + 'postgres', + q{ +ALTER SYSTEM SET max_slot_xid_age = 0; +SELECT pg_reload_conf(); +}); +$primary5->safe_psql('postgres', + "SELECT pg_drop_replication_slot('sb5_slot')"); +$primary5->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb5_slot', true)"); +$standby5->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum = off +}); +$standby5->start; + +$primary5->wait_for_catchup($standby5); + +$standby5->create_logical_slot_on_standby($primary5, 'sb5_logical_slot', + 'postgres'); + +$standby5->poll_query_until( + 'postgres', qq[ + SELECT catalog_xmin IS NOT NULL + FROM pg_catalog.pg_replication_slots + WHERE slot_name = 'sb5_logical_slot'; +]) or die "Timed out waiting for sb5_logical_slot catalog_xmin"; + +# Advance XIDs on primary, replay on standby, then restartpoint to invalidate +$primary5->safe_psql('postgres', qq{CALL consume_xid(2 * $max_slot_xid_age)}); +$primary5->safe_psql('postgres', "CHECKPOINT"); +$primary5->wait_for_replay_catchup($standby5); +$standby5->safe_psql('postgres', "CHECKPOINT"); + +wait_for_xid_aged_invalidation($standby5, 'sb5_logical_slot'); +ok(1, "logical (standby) slot invalidated due to XID age (via restartpoint)"); + +$standby5->stop; +$primary5->stop; + +# Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC +# 
============================================================================= + done_testing(); -- 2.47.3 [application/x-patch] v11-0002-Add-more-tests-for-XID-age-slot-invalidation.patch (6.3K, 3-v11-0002-Add-more-tests-for-XID-age-slot-invalidation.patch) download | inline diff: From 60084a4d54284c4a67d548830900a53ddd10450b Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Fri, 17 Apr 2026 18:31:10 +0000 Subject: [PATCH v11 2/3] Add more tests for XID age slot invalidation Consume XIDs up to wraparound WARNING limits with max_slot_xid_age matching vacuum_failsafe_age (1.6B). Verify that autovacuum invalidates the inactive replication slot (XID-age-based invalidation), unblocks datfrozenxid advancement, and prevents wraparound without any intervention. --- src/test/recovery/Makefile | 3 +- src/test/recovery/t/019_replslot_limit.pl | 130 ++++++++++++++++++++++ 2 files changed, 132 insertions(+), 1 deletion(-) diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile index d41aaaf8ae1..5c3d2c89941 100644 --- a/src/test/recovery/Makefile +++ b/src/test/recovery/Makefile @@ -12,7 +12,8 @@ EXTRA_INSTALL=contrib/pg_prewarm \ contrib/pg_stat_statements \ contrib/test_decoding \ - src/test/modules/injection_points + src/test/modules/injection_points \ + src/test/modules/xid_wraparound subdir = src/test/recovery top_builddir = ../../.. 
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 6b7e4818bf4..6f92034f3d5 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -715,4 +715,134 @@ $primary5->stop; # Testcase end: Invalidate logical slot on standby due to max_slot_xid_age GUC # ============================================================================= +# ================================================================================= +# Testcase start: XID-age-based slot invalidation with autovacuum (production-like) + +# Standby sets slot xmin via HS feedback, disconnects, XIDs are consumed. +# max_slot_xid_age is set to vacuum_failsafe_age (1.6B) so autovacuum +# invalidates the slot before entering failsafe mode, unblocking +# datfrozenxid advancement and avoiding XID wraparound without manual +# VACUUM or downtime. + +# Verify server log shows slot invalidation by autovacuum worker +sub verify_slot_xid_aged_invalidation_in_server_log +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + my $log = slurp_file($node->logfile); + + # Verify the invalidation was performed by an autovacuum worker + like($log, + qr/autovacuum worker\[\d+\] LOG:\s+invalidating obsolete replication slot "$slot_name"/, + "server log: $slot_name invalidated by autovacuum worker"); + + # Verify DETAIL shows the xmin age exceeding max_slot_xid_age + like($log, + qr/autovacuum worker\[\d+\] DETAIL:\s+The slot's (?:catalog )?xmin age of (\d+) exceeds the configured "max_slot_xid_age" of $max_age by (\d+) transactions/, + "server log: DETAIL shows xmin age exceeds max_slot_xid_age $max_age"); + + # Extract xid age from the log and report for diagnostics + $log =~ + /The slot's (?:catalog )?xmin age of (\d+) exceeds the configured "max_slot_xid_age" of $max_age by (\d+)/; + my $log_xid_age = $1 // 'N/A'; + my $exceeded_by = $2 // 'N/A'; + diag "xid_age from server log=$log_xid_age, exceeded_by=$exceeded_by, 
max_slot_xid_age=$max_age, consumed=$consumed_xids XIDs"; +} + +# Verify slot invalidation and wait for autovacuum to advance datfrozenxid +sub verify_invalidation_and_recovery +{ + my ($node, $slot_name, $max_age, $consumed_xids) = @_; + + return if $max_age == 0; + + wait_for_xid_aged_invalidation($node, $slot_name); + ok(1, 'autovacuum invalidated slot due to xid_aged'); + + verify_slot_xid_aged_invalidation_in_server_log($node, $slot_name, + $max_age, $consumed_xids); + + # Wait for autovacuum to advance datfrozenxid in all databases past the + # wraparound threshold. + $node->poll_query_until( + 'postgres', qq[ + SELECT NOT EXISTS ( + SELECT 1 FROM pg_database + WHERE age(datfrozenxid) > 2000000000 + ); + ]) or die "Timed out waiting for autovacuum to advance datfrozenxid in all databases"; +} + +my $primary6 = PostgreSQL::Test::Cluster->new('primary6'); +$primary6->init(allows_streaming => 'logical'); + +$max_slot_xid_age = 1600000000; # matches vacuum_failsafe_age default +$primary6->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age +autovacuum_naptime = 1s +}); + +$primary6->start; +$primary6->safe_psql('postgres', "CREATE EXTENSION xid_wraparound"); + +$backup_name = 'backup6'; +$primary6->backup($backup_name); + +my $standby6 = PostgreSQL::Test::Cluster->new('standby6'); +$standby6->init_from_backup($primary6, $backup_name, has_streaming => 1); +$standby6->append_conf( + 'postgresql.conf', q{ +primary_slot_name = 'sb6_slot' +hot_standby_feedback = on +wal_receiver_status_interval = 1 +}); + +$primary6->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('sb6_slot', true)"); + +$standby6->start; + +$primary6->safe_psql('postgres', + "CREATE TABLE tab_int6 AS SELECT generate_series(1,10) AS a"); +$primary6->wait_for_catchup($standby6); + +$primary6->poll_query_until( + 'postgres', qq[ + SELECT xmin IS NOT NULL FROM pg_replication_slots + WHERE slot_name = 'sb6_slot'; +]) or die "Timed out waiting for sb6_slot xmin 
from HS feedback"; + +# Stop standby; slot xmin persists and holds back datfrozenxid +$standby6->stop; + +# Consume XIDs in 50M chunks; autovacuum (naptime=1s) will invalidate the +# slot once xmin age exceeds max_slot_xid_age. +my $logstart6 = -s $primary6->logfile; +my $chunk = 50_000_000; +my $max_xids = 2_200_000_000; +my $consumed = 0; + +while ($consumed < $max_xids) +{ + $primary6->safe_psql('postgres', "SELECT consume_xids($chunk)"); + $consumed += $chunk; + my $remaining = $max_xids - $consumed; + diag "consumed $consumed / $max_xids XIDs ($remaining remaining)"; +} + +verify_invalidation_and_recovery($primary6, 'sb6_slot', + $max_slot_xid_age, $consumed); + +# Consume 1B more XIDs — combining with the 2.2B consumed above, the total +# of 3.2B exceeds the 2^31 (~2.1B) usable XID space (xidStopLimit), i.e. +# more than one full wraparound cycle, proving the system is healthy. +$primary6->safe_psql('postgres', "SELECT consume_xids(1000000000)"); +ok(1, 'writes succeed after autovacuum invalidated the slot'); + +$primary6->stop; + +# Testcase end: XID-age-based slot invalidation with autovacuum (production-like) +# ================================================================================ + done_testing(); -- 2.47.3 [application/x-patch] v11-0003-Avoid-concurrent-XID-age-slot-invalidation-attem.patch (12.2K, 4-v11-0003-Avoid-concurrent-XID-age-slot-invalidation-attem.patch) download | inline diff: From 7e17543594722054a9310a15da9b250d476e436e Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Fri, 17 Apr 2026 18:31:35 +0000 Subject: [PATCH v11 3/3] Avoid concurrent XID-age slot invalidation attempts Multiple processes (autovacuum workers, backends running VACUUM) can concurrently attempt to invalidate the same replication slot due to XID age, causing them to wait on the same ConditionVariable while waiting for a slow walsender to release the slot. 
Add an invalidating_proc field to ReplicationSlot to track which process is currently attempting invalidation. When a process sees that another live process is already working on a slot, it skips the slot and defers to a subsequent cycle. This prevents unnecessary blocking during XID-age based slot invalidation. Added an injection point in the walsender to mimic slow SIGTERM processing and a TAP test for the concurrent slot invalidation. --- src/backend/replication/slot.c | 75 ++++++++++ src/backend/tcop/postgres.c | 11 ++ src/include/replication/slot.h | 7 + src/test/recovery/t/019_replslot_limit.pl | 161 ++++++++++++++++++++++ 4 files changed, 254 insertions(+) diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index fe43f5c8820..7cef325a888 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -233,6 +233,7 @@ ReplicationSlotsShmemInit(void *arg) /* everything else is zeroed by the memset above */ slot->active_proc = INVALID_PROC_NUMBER; + slot->invalidating_proc = INVALID_PROC_NUMBER; SpinLockInit(&slot->mutex); LWLockInitialize(&slot->io_in_progress_lock, LWTRANCHE_REPLICATION_SLOT_IO); @@ -501,6 +502,7 @@ ReplicationSlotCreate(const char *name, bool db_specific, slot->last_saved_restart_lsn = InvalidXLogRecPtr; slot->inactive_since = 0; slot->slotsync_skip_reason = SS_SKIP_NONE; + slot->invalidating_proc = INVALID_PROC_NUMBER; /* * Create the slot on disk. We haven't actually marked the slot allocated @@ -2052,6 +2054,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, int last_signaled_pid = 0; bool released_lock = false; bool invalidated = false; + bool am_invalidating = false; TimestampTz inactive_since = 0; for (;;) @@ -2112,6 +2115,59 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, break; } + /* + * Skip XID-age invalidation if another process is already + * invalidating this slot. + * + * Check if another process is already trying to invalidate this slot. 
+ * If so, skip it to avoid multiple processes blocking on the same CV + * sleep. The first process will complete the invalidation attempt; + * others defer to a subsequent cycle. + * + * We handle this only for XID-age invalidation because multiple + * processes (autovacuum workers, backends running VACUUM, the + * checkpointer) can attempt it concurrently, making it likely that + * several end up blocking on the same ConditionVariable while waiting + * for a slow walsender to release the slot. Invalidation due to other + * causes can also involve multiple processes (e.g., on a standby, the + * checkpointer and the startup process may attempt to invalidate a + * slot for RS_INVAL_WAL_LEVEL and RS_INVAL_HORIZON respectively), but + * such concurrent attempts are rare in practice. + */ + if (invalidation_cause == RS_INVAL_XID_AGE && + s->invalidating_proc != INVALID_PROC_NUMBER) + { + int invalidating_pid; + + invalidating_pid = GetPGProcByNumber(s->invalidating_proc)->pid; + + if (invalidating_pid != 0 && + s->invalidating_proc != MyProcNumber) + { + /* Another live process is already invalidating this slot */ + SpinLockRelease(&s->mutex); + if (released_lock) + LWLockRelease(ReplicationSlotControlLock); + break; + } + + /* + * The previously recorded process has exited (pid == 0) or it's + * us. Reset and proceed with invalidation. + */ + s->invalidating_proc = INVALID_PROC_NUMBER; + } + + /* + * If the slot is active (we'll need to signal and wait), record + * ourselves as the invalidating process. + */ + if (s->active_proc != INVALID_PROC_NUMBER) + { + s->invalidating_proc = MyProcNumber; + am_invalidating = true; + } + slotname = s->data.name; active_proc = s->active_proc; @@ -2131,6 +2187,12 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, s->active_proc = MyProcNumber; s->data.invalidated = invalidation_cause; + /* + * Clear the invalidating process since we have completed the + * invalidation (acquired the slot and marked it invalid). 
+ */ + s->invalidating_proc = INVALID_PROC_NUMBER; + /* * XXX: We should consider not overwriting restart_lsn and instead * just rely on .invalidated. @@ -2255,6 +2317,19 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, } } + /* + * If we set invalidating_proc but exited without completing the + * invalidation (e.g., the slot caught up while we were waiting, or the + * slot was dropped), clear it so other processes don't skip this slot. + */ + if (am_invalidating && !invalidated) + { + SpinLockAcquire(&s->mutex); + if (s->invalidating_proc == MyProcNumber) + s->invalidating_proc = INVALID_PROC_NUMBER; + SpinLockRelease(&s->mutex); + } + Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock)); *released_lock_out = released_lock; diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 2c1f14b7889..846ee613243 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -3375,6 +3375,17 @@ ProcessInterrupts(void) ProcDieSenderUid = 0; QueryCancelPending = false; /* ProcDie trumps QueryCancel */ LockErrorCleanup(); + +#ifdef USE_INJECTION_POINTS + /* + * Injection point used to simulate a walsender that is slow to + * respond to SIGTERM, allowing tests to verify concurrent slot + * invalidation behavior. + */ + if (am_walsender) + INJECTION_POINT("walsender-before-sigterm-exit", NULL); +#endif + /* As in quickdie, don't risk sending to client during auth */ if (ClientAuthInProgress && whereToSendOutput == DestRemote) whereToSendOutput = DestNone; diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index cad1d89b05b..fb085447d23 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -284,6 +284,13 @@ typedef struct ReplicationSlot * slotsync_skip_reason provides no practical benefit. */ SlotSyncSkipReason slotsync_skip_reason; + + /* + * Process currently attempting to invalidate this slot. + * INVALID_PROC_NUMBER means no invalidation is in progress. 
Protected by + * the slot's mutex. + */ + ProcNumber invalidating_proc; } ReplicationSlot; #define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid) diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 6f92034f3d5..0897f4388d7 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -845,4 +845,165 @@ $primary6->stop; # Testcase end: XID-age-based slot invalidation with autovacuum (production-like) # ================================================================================ +# =============================================================================== +# Testcase start: Concurrent slot invalidation due to max_slot_xid_age GUC +# +# Two concurrent VACUUMs both try to invalidate the same active logical slot. +# An injection point delays the walsender's SIGTERM processing so that vacuum1 +# blocks on the CV waiting for the slot to be released. When vacuum2 runs, it +# sees that vacuum1 is already invalidating the same slot and skips without +# blocking. After the walsender is woken, vacuum1 completes the invalidation. + +# Skip if injection points are not available. +if ($ENV{enable_injection_points} ne 'yes') +{ + done_testing(); + exit; +} + +my $primary7 = PostgreSQL::Test::Cluster->new('primary7'); +$primary7->init(allows_streaming => 'logical'); + +my $max_slot_xid_age7 = 100; +$primary7->append_conf( + 'postgresql.conf', qq{ +max_slot_xid_age = $max_slot_xid_age7 +autovacuum = off +shared_preload_libraries = 'injection_points' +}); + +$primary7->start; + +# Check if injection_points extension is available. +if (!$primary7->check_extension('injection_points')) +{ + $primary7->stop; + done_testing(); + exit; +} + +$primary7->safe_psql('postgres', 'CREATE EXTENSION injection_points'); + +# Helper to consume XIDs. 
+$primary7->safe_psql( + 'postgres', qq{ + CREATE PROCEDURE consume_xid(cnt int) + AS \$\$ + DECLARE + i int; + BEGIN + FOR i IN 1..cnt LOOP + EXECUTE 'SELECT pg_current_xact_id()'; + COMMIT; + END LOOP; + END; + \$\$ LANGUAGE plpgsql; +}); + + +# Create a logical slot (gets catalog_xmin immediately). +$primary7->safe_psql('postgres', + "SELECT pg_create_logical_replication_slot('lslot7', 'test_decoding')"); + +# Hold the slot active via pg_recvlogical. +my $pg_recvlog_stdout7 = ''; +my $pg_recvlog_stderr7 = ''; +my $connstr7 = $primary7->connstr('postgres'); +my $pg_recvlogical_handle7 = IPC::Run::start( + [ + 'pg_recvlogical', '-d', $connstr7, + '--slot', 'lslot7', '--start', + '-f', '/dev/null', '--no-loop' + ], + '>', \$pg_recvlog_stdout7, + '2>', \$pg_recvlog_stderr7); + +# Wait for the slot to become active. +$primary7->poll_query_until( + 'postgres', qq[ + SELECT active FROM pg_replication_slots WHERE slot_name = 'lslot7'; +]) or die "Timed out waiting for slot lslot7 to become active"; + +# Make the walsender block before processing SIGTERM. +$primary7->safe_psql('postgres', + "SELECT injection_points_attach('walsender-before-sigterm-exit', 'wait')"); + +# Make the slot's catalog_xmin stale. +$primary7->safe_psql('postgres', + qq{CALL consume_xid(2 * $max_slot_xid_age7)}); + +# Launch vacuum1 on a catalog table: the logical slot holds catalog_xmin, +# so only catalog VACUUMs see it as OldestXmin and trigger invalidation. +# vacuum1 will SIGTERM the walsender, then block on the CV. +my $vacuum1 = $primary7->background_psql('postgres'); +my $vacuum2 = $primary7->background_psql('postgres'); + +$vacuum1->query_until( + qr/starting_vacuum/, + q(\echo starting_vacuum +VACUUM pg_class; +\echo vacuum1_done +)); + +# Wait for the walsender to hit the injection point. +$primary7->wait_for_event('walsender', 'walsender-before-sigterm-exit'); + +# Verify vacuum1 is blocked on the CV. 
+$primary7->poll_query_until(
+	'postgres', qq[
+	SELECT count(*) = 1 FROM pg_stat_activity
+	WHERE wait_event = 'ReplicationSlotDrop'
+	AND backend_type = 'client backend';
+]) or die "Timed out waiting for vacuum1 to block on slot CV";
+
+# Launch vacuum2 on a different catalog table: it also computes the catalog
+# horizon and sees the slot needs invalidation, but finds vacuum1 is already
+# invalidating the same slot (via invalidating_proc) and skips.
+$vacuum2->query_until(
+	qr/starting_vacuum/,
+	q(\echo starting_vacuum
+VACUUM pg_type;
+\echo vacuum2_done
+));
+
+# vacuum2 completes without blocking.
+$vacuum2->query_until(qr/vacuum2_done/, '');
+
+# Verify only vacuum1 is waiting on ReplicationSlotDrop.
+$result = $primary7->safe_psql(
+	'postgres', qq[
+	SELECT count(*) FROM pg_stat_activity
+	WHERE wait_event = 'ReplicationSlotDrop'
+	AND backend_type = 'client backend';
+]);
+is($result, '1',
+	'only vacuum1 blocks on CV; vacuum2 skips via invalidating_proc');
+
+# Wake up the walsender so it can exit and release the slot.
+$primary7->safe_psql('postgres',
+	"SELECT injection_points_wakeup('walsender-before-sigterm-exit')");
+
+# Detach the injection point.
+$primary7->safe_psql('postgres',
+	"SELECT injection_points_detach('walsender-before-sigterm-exit')");
+
+# Wait for pg_recvlogical to exit.
+$pg_recvlogical_handle7->finish;
+
+# Wait for vacuum1 to complete now that the walsender has released the slot.
+$vacuum1->query_until(qr/vacuum1_done/, '');
+
+# Verify the slot was invalidated.
+wait_for_xid_aged_invalidation($primary7, 'lslot7');
+ok(1,
+	"concurrent VACUUM: vacuum1 blocks on CV, vacuum2 skips via invalidating_proc"
+);
+
+$vacuum1->quit;
+$vacuum2->quit;
+$primary7->stop;
+
+# Testcase end: Concurrent slot invalidation due to max_slot_xid_age GUC
+# ===============================================================================
+
 done_testing();
-- 
2.47.3
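For readers following the series, the XID-age test that the 0001 hunk adds to DetermineSlotInvalidationCause() can be tried outside the server. The sketch below is only an illustration under assumptions: `xid_precedes`, `compute_xid_limit` and `slot_is_xid_aged` are hypothetical names, not functions from the patch. The comparison follows the modulo-2^32 rule that TransactionIdPrecedes() applies to normal XIDs. How the patch actually derives `xidLimit` is not shown in this part of the thread, so the "next XID minus max age" computation here, with the special XIDs skipped as autovacuum does for its force limit, is a guess. The `max_slot_xid_age != 0` and synced-slot checks from CanInvalidateXidAgedSlot() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId		((TransactionId) 0)
#define FirstNormalTransactionId	((TransactionId) 3)

/*
 * Wraparound-aware "a is older than b" for normal XIDs: the difference is
 * taken modulo 2^32 and interpreted as a signed 32-bit value.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
	return (int32_t) (a - b) < 0;
}

/*
 * Hypothetical helper (not from the patch): the XID that lies max_age
 * transactions behind next_xid.  If the subtraction lands on one of the
 * special XIDs 0..2, step back past them so the result is a normal XID.
 */
static TransactionId
compute_xid_limit(TransactionId next_xid, uint32_t max_age)
{
	TransactionId limit;

	assert(max_age > 0);		/* 0 means the feature is disabled */

	limit = next_xid - max_age;
	if (limit < FirstNormalTransactionId)
		limit -= FirstNormalTransactionId;

	return limit;
}

/*
 * Same shape as the check in the patch hunk: the slot qualifies if either
 * its xmin or its catalog_xmin is valid and older than the limit.
 */
static bool
slot_is_xid_aged(TransactionId xmin, TransactionId catalog_xmin,
				 TransactionId xid_limit)
{
	return (xmin != InvalidTransactionId &&
			xid_precedes(xmin, xid_limit)) ||
		(catalog_xmin != InvalidTransactionId &&
		 xid_precedes(catalog_xmin, xid_limit));
}
```

With a next XID of 1000 and a max age of 100 the limit is 900, so a slot whose xmin is 800 qualifies and one whose xmin is 950 does not. A slot with neither xmin nor catalog_xmin set never qualifies, which matches the validity checks in the hunk.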
end of thread, other threads:[~2026-04-23 18:11 UTC | newest]

Thread overview: 31+ messages
2025-09-18 17:20 Introduce XID age based replication slot invalidation John H <[email protected]>
2025-09-19 08:07 ` Hayato Kuroda (Fujitsu) <[email protected]>
2025-09-19 23:42 ` John H <[email protected]>
2026-03-20 16:10 ` Bharath Rupireddy <[email protected]>
2026-03-21 06:28 ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-03-23 16:00 ` Bharath Rupireddy <[email protected]>
2026-03-23 23:36 ` Masahiko Sawada <[email protected]>
2026-03-24 21:42 ` Bharath Rupireddy <[email protected]>
2026-03-25 06:50 ` Masahiko Sawada <[email protected]>
2026-03-25 19:17 ` Bharath Rupireddy <[email protected]>
2026-03-26 09:48 ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-03-26 10:42 ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-03-26 21:49 ` Masahiko Sawada <[email protected]>
2026-03-28 18:03 ` Bharath Rupireddy <[email protected]>
2026-03-29 20:16 ` Srinath Reddy Sadipiralla <[email protected]>
2026-03-30 01:35 ` Bharath Rupireddy <[email protected]>
2026-03-31 00:13 ` Masahiko Sawada <[email protected]>
2026-03-31 17:20 ` Bharath Rupireddy <[email protected]>
2026-04-01 19:38 ` Masahiko Sawada <[email protected]>
2026-04-01 21:21 ` Bharath Rupireddy <[email protected]>
2026-04-03 19:04 ` Bharath Rupireddy <[email protected]>
2026-04-05 08:03 ` Masahiko Sawada <[email protected]>
2026-04-06 02:52 ` Bharath Rupireddy <[email protected]>
2026-04-06 08:44 ` Masahiko Sawada <[email protected]>
2026-04-06 17:42 ` Bharath Rupireddy <[email protected]>
2026-04-07 14:39 ` Srinath Reddy Sadipiralla <[email protected]>
2026-04-16 05:03 ` Bharath Rupireddy <[email protected]>
2026-04-23 18:11 ` Bharath Rupireddy <[email protected]>
2026-03-31 07:25 ` Hayato Kuroda (Fujitsu) <[email protected]>
2026-03-31 16:45 ` Bharath Rupireddy <[email protected]>
2025-09-25 00:18 ` Bharath Rupireddy <[email protected]>