public inbox for [email protected]help / color / mirror / Atom feed
Fix pg_upgrade to detect invalid logical replication slots on PG19 5+ messages / 4 participants [nested] [flat]
* Fix pg_upgrade to detect invalid logical replication slots on PG19 @ 2026-04-20 08:57 Lakshmi N <[email protected]> 0 siblings, 2 replies; 5+ messages in thread From: Lakshmi N @ 2026-04-20 08:57 UTC (permalink / raw) To: [email protected] <[email protected]> Hi Hackers, The PG19-optimized slot catchup query uses a CTE that filters on invalidation_reason IS NULL, then cross-joins it with the main slot query. When ALL logical slots in a database are invalid, the CTE returns zero rows, and the cross join produces an empty result set. This causes pg_upgrade to silently skip those slots entirely -- neither detecting them as invalid (which should block the upgrade) nor attempting to migrate them. The pre-PG19 query path does not have this problem because it queries pg_replication_slots directly without a cross join. This may not impact upgrade to PG19 but will change the behavior for PG20 upgrade. Fix by changing the cross join to a LEFT JOIN, so that invalid slots still appear in the result set with NULL caught_up values. Regards, Lakshmi Attachments: [application/octet-stream] 0001-Fix-pg_upgrade-to-detect-invalid-logical-replication.patch (965B, 3-0001-Fix-pg_upgrade-to-detect-invalid-logical-replication.patch) download | inline diff: From 6a80958204364c94b511e7728f83281013cdbf06 Mon Sep 17 00:00:00 2001 From: Lakshmi N <[email protected]> Date: Mon, 20 Apr 2026 01:47:50 -0700 Subject: [PATCH] Fix pg_upgrade to detect invalid logical replication slots on PG19 --- src/bin/pg_upgrade/info.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c index 8c5679b8097..37fff93892f 100644 --- a/src/bin/pg_upgrade/info.c +++ b/src/bin/pg_upgrade/info.c @@ -743,7 +743,8 @@ get_old_cluster_logical_slot_infos_query(ClusterInfo *cluster) " confirmed_flush_lsn > last_pending_wal " "END as caught_up, " "invalidation_reason IS NOT NULL as invalid " - "FROM pg_catalog.pg_replication_slots, check_caught_up " + "FROM pg_catalog.pg_replication_slots " + "LEFT JOIN check_caught_up ON true " "WHERE slot_type = 'logical' AND " "database = current_database() AND " "temporary IS FALSE "; -- 2.43.0 ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Fix pg_upgrade to detect invalid logical replication slots on PG19 @ 2026-04-22 04:27 shveta malik <[email protected]> parent: Lakshmi N <[email protected]> 1 sibling, 1 reply; 5+ messages in thread From: shveta malik @ 2026-04-22 04:27 UTC (permalink / raw) To: Lakshmi N <[email protected]>; Sawada Masahiko <[email protected]>; +Cc: [email protected] <[email protected]>; shveta malik <[email protected]> On Mon, Apr 20, 2026 at 2:28 PM Lakshmi N <[email protected]> wrote: > > Hi Hackers, > > The PG19-optimized slot catchup query uses a CTE that filters on > invalidation_reason IS NULL, then cross-joins it with the main slot > query. When ALL logical slots in a database are invalid, the CTE > returns zero rows, and the cross join produces an empty result set. > This causes pg_upgrade to silently skip those slots entirely -- > neither detecting them as invalid (which should block the upgrade) > nor attempting to migrate them. > > The pre-PG19 query path does not have this problem because it queries > pg_replication_slots directly without a cross join. This may not impact > upgrade to PG19 but will change the behavior for PG20 upgrade. > > Fix by changing the cross join to a LEFT JOIN, > so that invalid slots still appear in the result set with NULL > caught_up values. > I agree with the problem here. Another way to solve this would be using a scalar subquery(see [1]), but that would reduce readability. Thus, I prefer a LEFT OUTER JOIN on TRUE here. There should also be no performance impact, since the right-hand side query will always return at most one row due to the LIMIT 1 clause. So IMO, the proposed solution is good. Copying Sawada-san, as he was the author of the original patch. [1]: SELECT slot_name, plugin, two_phase, failover, CASE WHEN invalidation_reason IS NOT NULL THEN FALSE ELSE ( (SELECT last_pending_wal FROM check_caught_up) IS NULL OR confirmed_flush_lsn > (SELECT last_pending_wal FROM check_caught_up) ) END as caught_up, invalidation_reason IS NOT NULL as invalid FROM pg_catalog.pg_replication_slots WHERE slot_type = 'logical' AND database = current_database() AND temporary IS FALSE; thanks Shveta ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Fix pg_upgrade to detect invalid logical replication slots on PG19 @ 2026-04-22 08:02 Masahiko Sawada <[email protected]> parent: shveta malik <[email protected]> 0 siblings, 1 reply; 5+ messages in thread From: Masahiko Sawada @ 2026-04-22 08:02 UTC (permalink / raw) To: shveta malik <[email protected]>; +Cc: Lakshmi N <[email protected]>; [email protected] <[email protected]> On Tue, Apr 21, 2026 at 9:27 PM shveta malik <[email protected]> wrote: > > On Mon, Apr 20, 2026 at 2:28 PM Lakshmi N <[email protected]> wrote: > > > > Hi Hackers, > > > > The PG19-optimized slot catchup query uses a CTE that filters on > > invalidation_reason IS NULL, then cross-joins it with the main slot > > query. When ALL logical slots in a database are invalid, the CTE > > returns zero rows, and the cross join produces an empty result set. > > This causes pg_upgrade to silently skip those slots entirely -- > > neither detecting them as invalid (which should block the upgrade) > > nor attempting to migrate them. > > > > The pre-PG19 query path does not have this problem because it queries > > pg_replication_slots directly without a cross join. This may not impact > > upgrade to PG19 but will change the behavior for PG20 upgrade. Thank you for the report. > > > > Fix by changing the cross join to a LEFT JOIN, > > so that invalid slots still appear in the result set with NULL > > caught_up values. > > > > I agree with the problem here. I've confirmed the bug and the patch fixes it. The patch looks good to me, so I'm going to push it tomorrow barring any objections. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Fix pg_upgrade to detect invalid logical replication slots on PG19 @ 2026-04-22 09:13 Chao Li <[email protected]> parent: Lakshmi N <[email protected]> 1 sibling, 0 replies; 5+ messages in thread From: Chao Li @ 2026-04-22 09:13 UTC (permalink / raw) To: Lakshmi N <[email protected]>; +Cc: [email protected] <[email protected]> > On Apr 20, 2026, at 16:57, Lakshmi N <[email protected]> wrote: > > Hi Hackers, > > The PG19-optimized slot catchup query uses a CTE that filters on > invalidation_reason IS NULL, then cross-joins it with the main slot > query. When ALL logical slots in a database are invalid, the CTE > returns zero rows, and the cross join produces an empty result set. > This causes pg_upgrade to silently skip those slots entirely -- > neither detecting them as invalid (which should block the upgrade) > nor attempting to migrate them. > > The pre-PG19 query path does not have this problem because it queries > pg_replication_slots directly without a cross join. This may not impact > upgrade to PG19 but will change the behavior for PG20 upgrade. > > Fix by changing the cross join to a LEFT JOIN, > so that invalid slots still appear in the result set with NULL > caught_up values. > > Regards, > Lakshmi > <0001-Fix-pg_upgrade-to-detect-invalid-logical-replication.patch> This changes the inner join to a left join, which preserves the rows from pg_replication_slots. As check_caught_up is intentionally a singleton CTE because ORDER BY … LIMIT 1, so there is no row multiplication risk. So, the patch looks good to me. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Fix pg_upgrade to detect invalid logical replication slots on PG19 @ 2026-04-22 17:01 Masahiko Sawada <[email protected]> parent: Masahiko Sawada <[email protected]> 0 siblings, 0 replies; 5+ messages in thread From: Masahiko Sawada @ 2026-04-22 17:01 UTC (permalink / raw) To: shveta malik <[email protected]>; +Cc: Lakshmi N <[email protected]>; [email protected] <[email protected]> On Wed, Apr 22, 2026 at 1:02 AM Masahiko Sawada <[email protected]> wrote: > > On Tue, Apr 21, 2026 at 9:27 PM shveta malik <[email protected]> wrote: > > > > On Mon, Apr 20, 2026 at 2:28 PM Lakshmi N <[email protected]> wrote: > > > > > > Hi Hackers, > > > > > > The PG19-optimized slot catchup query uses a CTE that filters on > > > invalidation_reason IS NULL, then cross-joins it with the main slot > > > query. When ALL logical slots in a database are invalid, the CTE > > > returns zero rows, and the cross join produces an empty result set. > > > This causes pg_upgrade to silently skip those slots entirely -- > > > neither detecting them as invalid (which should block the upgrade) > > > nor attempting to migrate them. > > > > > > The pre-PG19 query path does not have this problem because it queries > > > pg_replication_slots directly without a cross join. This may not impact > > > upgrade to PG19 but will change the behavior for PG20 upgrade. > > Thank you for the report. > > > > > > > Fix by changing the cross join to a LEFT JOIN, > > > so that invalid slots still appear in the result set with NULL > > > caught_up values. > > > > > > > I agree with the problem here. > > I've confirmed the bug and the patch fixes it. The patch looks good to > me, so I'm going to push it tomorrow barring any objections. Pushed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 5+ messages in thread
end of thread, other threads:[~2026-04-22 17:01 UTC | newest] Thread overview: 5+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-04-20 08:57 Fix pg_upgrade to detect invalid logical replication slots on PG19 Lakshmi N <[email protected]> 2026-04-22 04:27 ` shveta malik <[email protected]> 2026-04-22 08:02 ` Masahiko Sawada <[email protected]> 2026-04-22 17:01 ` Masahiko Sawada <[email protected]> 2026-04-22 09:13 ` Chao Li <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox