public inbox for [email protected]
help / color / mirror / Atom feedFrom: Alex Masterov <[email protected]>
To: Kyotaro Horiguchi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Subject: Re: BUG #18354: Aborted transaction aborted during cleanup when temp_file_limit exceeded
Date: Wed, 6 May 2026 13:26:51 +0200
Message-ID: <CA+8z=zu-mxXuBtnTVNcOzaGJhgeKSTyeAsK+AdRMr5_XZM1SDw@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<CAHewXN=ZEL1Rvw2fJmt8k3sO_3fb58gijv7GR-JnY8ujgGP0UA@mail.gmail.com>
<[email protected]>
<[email protected]>
Hi,
While running tests with Neon, we discovered an assertion failure that can
occur during re-entrant AbortTransaction() calls.
The issue arises when an error occurs during AbortTransaction() after
ProcArrayEndTransaction() has cleared MyProc->xid. If another error is
raised
during cleanup (e.g., in AtEOXact_Inval()), the PostgresMain error handler
invokes AbortCurrentTransaction() again. The second AbortTransaction() call
reads a still-valid s->transactionId (CleanupTransaction() hasn't run yet)
and passes it to ProcArrayEndTransaction(), which then hits:
Assert(TransactionIdIsValid(proc->xid))
because MyProc->xid was already cleared by the first call.
The attached patch fixes this by checking MyProc->xid validity before
calling
RecordTransactionAbort() and only passing a valid latestXid when
appropriate.
**Reproduction:**
This can be reproduced reliably using the injection_points extension:
1. Attach the injection point:
SELECT injection_points_attach('transaction-end-process-inval', 'error');
2. Create invalidation messages: CREATE TABLE test(id int);
3. Trigger abort: ROLLBACK;
Without the fix: assertion crash on ProcArrayEndTransaction()
With the fix applied: the script will panic with "ERRORDATA_STACK_SIZE
exceeded"
due to re-entrant error handling, demonstrating that the assertion is
resolved.
I've included a reproduction script and the fix that clearly shows both
behaviors.
**Files attached:**
- 0001-xact-Prevent-assertion-failure-in-re-entrant-Abort.patch
- repro_minimal_panic_if_fixed.sh
Thoughts?
Best regards,
Alexey
>
>
>
>
>
>
>
Attachments:
[application/octet-stream] 0001-xact-Prevent-assertion-failure-in-re-entrant-Abort.patch (1.6K, 3-0001-xact-Prevent-assertion-failure-in-re-entrant-Abort.patch)
download | inline diff:
commit d0a862d09016b2c02811821644688cee2c33089d
Author: Alexey Masterov <[email protected]>
Date: Wed May 6 11:01:31 2026 +0200
xact: Prevent assertion failure in re-entrant AbortTransaction calls
When an error occurs during AbortTransaction() after ProcArrayEndTransaction()
has cleared MyProc->xid, subsequent abort attempts could pass a valid
latestXid to ProcArrayEndTransaction() while proc->xid is already invalid,
triggering Assert(TransactionIdIsValid(proc->xid)).
Avoid this by checking MyProc->xid validity before calling
RecordTransactionAbort() and only passing a valid latestXid when appropriate.
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 48bc90c9673..c7819d7d9d4 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2979,7 +2979,20 @@ AbortTransaction(void)
* record.
*/
if (!is_parallel_worker)
- latestXid = RecordTransactionAbort(false);
+ {
+ /*
+ * Re-entrant abort: if another ERROR is raised during AbortTransaction()
+ * after ProcArrayEndTransaction() cleared our advertised XID, local
+ * transaction state can still report an XID until CleanupTransaction().
+ * Do not run RecordTransactionAbort again (duplicate WAL/clog) and do
+ * not pass a valid latestXid to ProcArrayEndTransaction with proc->xid
+ * already cleared.
+ */
+ if (TransactionIdIsValid(MyProc->xid))
+ latestXid = RecordTransactionAbort(false);
+ else
+ latestXid = InvalidTransactionId;
+ }
else
{
latestXid = InvalidTransactionId;
[text/x-sh] repro_minimal_panic_if_fixed.sh (1.1K, 4-repro_minimal_panic_if_fixed.sh)
download | inline:
#!/usr/bin/env bash
#
# repro_minimal_panic_if_fixed.sh -- Minimal reproduction with raw behavior
#
# This script demonstrates the pure difference between original bug and fixed version:
# โข Original bug: TRAP assertion failure (immediate crash)
# โข Bug fixed: PANIC ERRORDATA_STACK_SIZE exceeded (infinite loop)
#
# No mitigation, no auto-detach - shows the raw behavior difference.
set -euo pipefail
PSQL=${PSQL:-psql}
echo "โก Minimal Re-entrant AbortTransaction Reproduction"
echo "================================================"
echo ""
echo "Expected behavior:"
echo " ๐ Original bug: TRAP assertion failure"
echo " ๐ง Bug fixed: PANIC ERRORDATA_STACK_SIZE exceeded"
echo ""
$PSQL -c "CREATE EXTENSION IF NOT EXISTS injection_points;" >/dev/null
echo "๐งช Executing minimal reproduction (raw behavior)..."
$PSQL <<'SQL'
SELECT injection_points_set_local();
SELECT injection_points_attach('transaction-end-process-inval', 'error');
BEGIN;
CREATE TABLE _minimal_repro (id int);
ROLLBACK;
SQL
echo "๐ Clean completion - bug conditions not met or injection point didn't fire"
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: BUG #18354: Aborted transaction aborted during cleanup when temp_file_limit exceeded
In-Reply-To: <CA+8z=zu-mxXuBtnTVNcOzaGJhgeKSTyeAsK+AdRMr5_XZM1SDw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox