public inbox for [email protected]
help / color / mirror / Atom feedFrom: SATYANARAYANA NARLAPURAM <[email protected]>
To: Alexander Korotkov <[email protected]>
To: PostgreSQL Hackers <[email protected]>
To: PostgreSQL-development <[email protected]>
Subject: Bug: WAIT FOR LSN crashes with assertion failure inside PL/pgSQL DO blocks and procedures
Date: Wed, 8 Apr 2026 19:03:31 -0700
Message-ID: <CAHg+QDcN-n3NUqgRtj=BQb9fFQmH8-DeEROCr=PDbw_BBRKOYA@mail.gmail.com> (raw)
Hi hackers, Alexander,
An assertion failure (server crash in assert-enabled builds) occurs when
WAIT FOR LSN ... INTO is used inside PL/pgSQL DO blocks or within void
procedures.
Repro:
-- Run this on a standby
CREATE PROCEDURE test_wait()
LANGUAGE plpgsql AS $$
DECLARE
result text;
BEGIN
WAIT FOR LSN '0/1234' INTO result;
RAISE NOTICE '%', result;
END;
$$;
CALL test_wait();
The WAIT FOR itself succeeds, but the very next PL/pgSQL statement that
requires a snapshot crashes the backend with:
TRAP: failed Assert("portal->portalSnapshot == NULL"),
File: "pquery.c", Line: 1776
Attached patches for both the test case and a potential fix. Please review.
Thanks,
Satya
Attachments:
[application/octet-stream] 0002-Add-regression-tests-for-WAIT-FOR-LSN-inside-PL-pgSQ.patch (2.1K, 3-0002-Add-regression-tests-for-WAIT-FOR-LSN-inside-PL-pgSQ.patch)
download | inline diff:
From d82c0f0d207537defadc25e8ef8f69cd3dc28b32 Mon Sep 17 00:00:00 2001
From: Satya Narlapuram <[email protected]>
Date: Thu, 9 Apr 2026 00:50:43 +0000
Subject: [PATCH v1 2/2] Add regression tests for WAIT FOR LSN inside PL/pgSQL
Add test section 7f to 049_wait_for_lsn.pl covering WAIT FOR LSN ...
INTO inside PL/pgSQL DO blocks and procedures.
---
src/test/recovery/t/049_wait_for_lsn.pl | 35 +++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/src/test/recovery/t/049_wait_for_lsn.pl b/src/test/recovery/t/049_wait_for_lsn.pl
index bf61b8c4..11d3611d 100644
--- a/src/test/recovery/t/049_wait_for_lsn.pl
+++ b/src/test/recovery/t/049_wait_for_lsn.pl
@@ -587,6 +587,41 @@ $output = $node_primary->safe_psql('postgres',
ok($output >= 0,
"multiple primary_flush waiters: primary flushed WAL up to target LSN");
+# Use an already-replayed LSN so the WAIT returns immediately on standby.
+my $wait_into_lsn = $node_standby->safe_psql('postgres',
+ "SELECT pg_last_wal_replay_lsn()");
+
+# 7f. Test DO block with WAIT FOR INTO followed by RAISE NOTICE on standby.
+$node_standby->safe_psql(
+ 'postgres', qq[
+ DO \$\$
+ DECLARE
+ result text;
+ BEGIN
+ WAIT FOR LSN '${wait_into_lsn}' INTO result;
+ RAISE NOTICE 'got: %', result;
+ END;
+ \$\$;]);
+ok(1, "WAIT FOR INTO in DO block does not crash on standby");
+
+# Test void procedure with WAIT FOR INTO followed by RAISE on standby.
+$node_primary->safe_psql(
+ 'postgres', qq[
+ CREATE PROCEDURE test_wait_into_raise()
+ LANGUAGE plpgsql AS \$\$
+ DECLARE
+ result text;
+ BEGIN
+ WAIT FOR LSN '${wait_into_lsn}' INTO result;
+ RAISE NOTICE 'wait result: %', result;
+ END;
+ \$\$;]);
+$node_primary->wait_for_catchup($node_standby);
+
+$node_standby->safe_psql('postgres',
+ "CALL test_wait_into_raise()");
+ok(1, "WAIT FOR INTO in void procedure does not crash on standby");
+
# 8. Check that the standby promotion terminates all standby wait modes. Start
# waiting for unreachable LSNs with standby_replay, standby_write, and
# standby_flush modes, then promote. Check the log for the relevant error
--
2.43.0
[application/octet-stream] 0001-Fix-assertion-failure-in-WAIT-FOR-LSN-when-used-insi.patch (2.4K, 4-0001-Fix-assertion-failure-in-WAIT-FOR-LSN-when-used-insi.patch)
download | inline diff:
From 16ace60f82062f0f13fc1814085b045aedd6f202 Mon Sep 17 00:00:00 2001
From: Satya Narlapuram <[email protected]>
Date: Thu, 9 Apr 2026 00:50:20 +0000
Subject: [PATCH v1 1/2] Fix assertion failure in WAIT FOR LSN when used inside
PL/pgSQL
WAIT FOR LSN ... INTO inside a DO block crashed with an assertion failure in
EnsurePortalSnapshotExists():
TRAP: failed Assert("portal->portalSnapshot == NULL"),
File: "pquery.c", Line: 1776
The root cause was that ExecWaitStmt() called PopActiveSnapshot() to
release the snapshot before waiting, but did not clear the active
portal's portalSnapshot pointer. When the next PL/pgSQL statement
requiring a snapshot called EnsurePortalSnapshotExists(), it found
a stale non-NULL portalSnapshot and hit the assertion.
PortalRunUtility() normally handles this cleanup in the top-level
execution path, but when WAIT FOR runs through SPI (PL/pgSQL),
ProcessUtility is called directly without the PortalRunUtility
wrapper, so the portalSnapshot was never cleared.
Fix by clearing ActivePortal->portalSnapshot after PopActiveSnapshot()
in ExecWaitStmt(), matching what PortalRunUtility() does.
---
src/backend/commands/wait.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 85fcd463..8d245dbd 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -24,6 +24,7 @@
#include "executor/executor.h"
#include "parser/parse_node.h"
#include "storage/proc.h"
+#include "tcop/pquery.h"
#include "utils/builtins.h"
#include "utils/guc.h"
#include "utils/pg_lsn.h"
@@ -146,10 +147,20 @@ ExecWaitStmt(ParseState *pstate, WaitStmt *stmt, DestReceiver *dest)
* PlannedStmtRequiresSnapshot(), even in an atomic context, CallStmt is
* processed with a snapshot. Thankfully, we can pop this snapshot,
* because PortalRunUtility() can tolerate this.
+ *
+ * When running inside PL/pgSQL, the active
+ * portal may hold a pointer to this snapshot in portalSnapshot. We must
+ * clear it so that EnsurePortalSnapshotExists() doesn't trip an assertion
+ * when the next PL/pgSQL statement tries to re-establish a snapshot.
*/
if (ActiveSnapshotSet())
+ {
PopActiveSnapshot();
+ if (ActivePortal && ActivePortal->portalSnapshot != NULL)
+ ActivePortal->portalSnapshot = NULL;
+ }
+
/*
* At second, invalidate a catalog snapshot if any. And we should be done
* with the preparation.
--
2.43.0
view thread (11+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Bug: WAIT FOR LSN crashes with assertion failure inside PL/pgSQL DO blocks and procedures
In-Reply-To: <CAHg+QDcN-n3NUqgRtj=BQb9fFQmH8-DeEROCr=PDbw_BBRKOYA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox