public inbox for [email protected]
* BUG #19480: PL/Python SRF crashes (SIGSEGV) when function is replaced mid-iteration: use-after-free in PLy_funct
From: PG Bug reporting form @ 2026-05-15 11:11 UTC (permalink / raw)
To: [email protected]; +Cc: [email protected]
The following bug has been logged on the website:
Bug reference: 19480
Logged by: Andrzej Doros
Email address: [email protected]
PostgreSQL version: 17.9
Operating system: Ubuntu 22.04.5 LTS (x86_64), kernel 5.15, glibc 2.
Description:
PostgreSQL version: 17.9 (production crash), confirmed identical on 17.10
OS: Ubuntu 22.04.5 LTS, x86_64, kernel 5.15, glibc 2.35
Package: postgresql-plpython3-17 from pgdg apt repository
DESCRIPTION
-----------
A PL/Python set-returning function (SRF) crashes the backend with SIGSEGV when
another session executes CREATE OR REPLACE FUNCTION (or ALTER FUNCTION) on the
same function while the SRF is mid-iteration.
This is a use-after-free. srfstate->savedargs is allocated inside proc->mcxt by
PLy_function_save_args() (plpy_exec.c:503). On each per-call SRF invocation,
plpython3_call_handler calls PLy_procedure_get(), which may call
PLy_procedure_delete(old_proc) -> MemoryContextDelete(old_proc->mcxt) if the
function's pg_proc row has changed (different xmin or ctid). After that,
srfstate->savedargs is a dangling pointer; it is not cleared. The next
PLy_function_restore_args() reads freed memory:
    if (srfstate->savedargs)    /* non-NULL dangling pointer */
        PLy_function_restore_args(proc, srfstate->savedargs);    /* reads freed mem */
Inside PLy_function_restore_args (plpy_exec.c:551):
    for (i = 0; i < savedargs->nargs; i++)    /* nargs from freed memory */
    {
        if (proc->argnames[i] && ...)
            PyDict_SetItemString(..., proc->argnames[i], ...);
When savedargs->nargs is garbage (e.g. 2056017128 in two production core dumps),
proc->argnames[i] for large i reads an invalid pointer, which is passed to
PyDict_SetItemString -> PyUnicode_FromString -> strlen -> SIGSEGV.
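The lifetime problem described above can be modelled outside PostgreSQL. What follows is only a toy Python sketch, not PL/Python source: ToyContext, ToySavedArgs and restore_args are hypothetical stand-ins for a memory context, PLySavedArgs and PLy_function_restore_args, and "freeing" is simulated by scribbling over nargs the way a later allocation might.

```python
class ToySavedArgs:
    """Hypothetical stand-in for PLySavedArgs: a count plus saved values."""
    def __init__(self, nargs, namedargs):
        self.nargs = nargs
        self.namedargs = namedargs


class ToyContext:
    """Hypothetical stand-in for a memory context owning its allocations."""
    def __init__(self, name):
        self.name = name
        self.chunks = []

    def alloc(self, obj):
        self.chunks.append(obj)
        return obj

    def delete(self):
        # Real freed memory keeps its old bytes until it is reused; the toy
        # scribbles immediately so the stale read becomes visible.
        for chunk in self.chunks:
            chunk.nargs = 2056017128
        self.chunks.clear()


def restore_args(argnames, savedargs):
    """Same loop shape as the excerpt: trusts savedargs.nargs."""
    restored = {}
    for i in range(savedargs.nargs):
        # Raises IndexError once i runs past the end of argnames.
        restored[argnames[i]] = savedargs.namedargs[i]
    return restored


proc_mcxt = ToyContext("proc->mcxt")
srf_savedargs = proc_mcxt.alloc(ToySavedArgs(1, ["tags"]))

# First per-call invocation: everything is consistent.
first = restore_args(["flavour"], srf_savedargs)

# The function is replaced: the old procedure's context goes away, but the
# SRF state still holds a reference to what was allocated in it.
proc_mcxt.delete()

try:
    restore_args(["flavour"], srf_savedargs)
    outcome = "ok"
except IndexError:
    outcome = "out-of-bounds read"
```

Python raises on the stale index; in C the equivalent read walks off the end of argnames[] and hands whatever it finds to PyDict_SetItemString.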
CRASH STACK (two identical core dumps from production, PG 17.9, Ubuntu 22.04)
------------------------------------------------------------------------------
#0 __strlen_evex()
#1 PyUnicode_FromString(u=0x69ffff0000)
#2 PyDict_SetItemString(...)
#3 PLy_function_restore_args(proc=..., savedargs=...)
#4 PLy_exec_function(...)
#5 plpython3_call_handler(...)
#6 fmgr_security_definer(...)
#7 ExecMakeTableFunctionResult(...)
State from the newer core dump:
proc->proname = "tags_report_plpython"
proc->nargs = 1
proc->argnames[0]= "flavour"
savedargs->nargs = 2056017128 <- should be 1; contains garbage
savedargs->namedargs[0] = 'tags' <- still valid (not yet overwritten)
i = 4 <- loop has iterated far past argnames[]
TRIGGER CONDITION
-----------------
The pg_proc invalidation reaches Session A's backend when
AcceptInvalidationMessages() is called. This happens when Session A's Python
code calls plpy.execute() with a statement that acquires a NEW relation lock
(e.g. CREATE TEMP TABLE, any table not previously locked in this statement).
Simply calling plpy.execute("SELECT 1") is not sufficient because the lock on
pg_proc is already held and subsequent requests are served from the per-process
lock table without invoking AcceptInvalidationMessages.
In production the trigger is autovacuum on pg_proc (which moves the tuple's
ctid) or any concurrent DDL from another session. Long-running SRFs (hours)
are much more likely to hit this window.
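The per-call lookup can be pictured as a cache keyed by function OID whose entries are validated against the current pg_proc row. The sketch below is a simplified Python model under that assumption; ToyProcCache, RowIdentity and get are hypothetical names, the OID and xmin/ctid values are made up, and the real check lives in PLy_procedure_get().

```python
from collections import namedtuple

# Identity of a pg_proc row as described in this report: xmin plus ctid.
RowIdentity = namedtuple("RowIdentity", ["xmin", "ctid"])


class ToyProcCache:
    """Hypothetical model of a per-backend procedure cache."""
    def __init__(self):
        self.entries = {}   # fn_oid -> (RowIdentity, proc dict)
        self.deleted = []   # procs whose private context was dropped

    def get(self, fn_oid, current_row):
        cached = self.entries.get(fn_oid)
        if cached is not None:
            identity, proc = cached
            if identity == current_row:
                return proc            # still valid, reuse it
            self.deleted.append(proc)  # stale: drop it and all it owned
        proc = {"fn_oid": fn_oid, "row": current_row}
        self.entries[fn_oid] = (current_row, proc)
        return proc


cache = ToyProcCache()
row_v1 = RowIdentity(xmin=1000, ctid=(0, 7))
proc_a = cache.get(16384, row_v1)
proc_b = cache.get(16384, row_v1)                      # unchanged row
proc_c = cache.get(16384, RowIdentity(1000, (3, 2)))   # same xmin, tuple moved
proc_d = cache.get(16384, RowIdentity(1042, (3, 5)))   # replaced by DDL
```

Either a changed xmin or a changed ctid is enough to retire the cached entry, which is why both triggers named above lead to the same PLy_procedure_delete() call.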
STEPS TO REPRODUCE
------------------
Requires two concurrent sessions and PostgreSQL with plpython3u.
Session A — start and leave running:
CREATE EXTENSION IF NOT EXISTS plpython3u;
CREATE OR REPLACE FUNCTION repro_srf(flavour VARCHAR)
RETURNS TABLE (i BIGINT) AS $$
import time
for i in range(100):
    # CREATE TEMP TABLE acquires a new relation lock each iteration,
    # which causes AcceptInvalidationMessages to be called.
    plpy.execute(f"CREATE TEMP TABLE _rt_{i} (x int)")
    plpy.execute(f"DROP TABLE _rt_{i}")
    time.sleep(0.3)
    yield i
$$ LANGUAGE plpython3u VOLATILE;
SELECT count(*) FROM repro_srf('test');
Session B — while Session A is running (after ~2 seconds):
CREATE OR REPLACE FUNCTION repro_srf(flavour VARCHAR)
RETURNS TABLE (i BIGINT) AS $$
import time
for i in range(100):
    plpy.execute(f"CREATE TEMP TABLE _rt_{i} (x int)")
    plpy.execute(f"DROP TABLE _rt_{i}")
    time.sleep(0.3)
    yield i
$$ LANGUAGE plpython3u VOLATILE;
NOTE: In a minimal test without memory pressure, the freed savedargs memory
is often not overwritten quickly enough to produce a crash: savedargs->nargs
accidentally retains its correct value of 1 and restore_args succeeds. Under
production load (long-running SRF, many Python allocations), the freed region
is overwritten and the crash occurs.
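Why the stale read is harmless until something else reuses the memory can be shown with a toy allocator. This is a minimal Python sketch under simplifying assumptions, not how PostgreSQL's allocator works internally: ToyArena, read_int and write_int are hypothetical helpers, and the free list ignores chunk sizes.

```python
import struct


class ToyArena:
    """Hypothetical bump allocator with a simple free list over a bytearray."""
    def __init__(self, size):
        self.mem = bytearray(size)
        self.top = 0
        self.free_list = []

    def alloc(self, nbytes):
        if self.free_list:
            return self.free_list.pop()   # reuse the most recently freed offset
        offset = self.top
        self.top += nbytes
        return offset

    def free(self, offset):
        self.free_list.append(offset)     # contents are left untouched


def read_int(arena, offset):
    return struct.unpack_from("<i", arena.mem, offset)[0]


def write_int(arena, offset, value):
    struct.pack_into("<i", arena.mem, offset, value)


arena = ToyArena(64)
savedargs = arena.alloc(8)
write_int(arena, savedargs, 1)            # nargs = 1

arena.free(savedargs)
after_free = read_int(arena, savedargs)   # stale read, old value still there

other = arena.alloc(8)                    # unrelated allocation reuses the slot
write_int(arena, other, 2056017128)
after_reuse = read_int(arena, savedargs)  # stale read now sees foreign data
```

The first stale read still returns 1, which matches the minimal-test behaviour; only after the slot is handed out again does the dangling offset see a value like the one found in the core dumps.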
The crash can be triggered deterministically with gdb by setting
savedargs->nargs to a large value immediately after PLy_procedure_delete fires
(see gdb script below). This produces the identical crash stack seen in
production.
GDB CONFIRMATION (PostgreSQL 17.10)
-------------------------------------
The following gdb session was used to confirm the exact sequence:
(gdb) b PLy_procedure_delete
(gdb) commands 1
> printf "DELETE proname=%s mcxt=%p\n", proc->proname, proc->mcxt
> set $corrupt_next = 1
> c
> end
(gdb) b PLy_function_restore_args
(gdb) commands 2
> if $corrupt_next
> set {int}((long)savedargs + 24) = 2056017128
> set $corrupt_next = 0
> end
> c
> end
Output:
DELETE proname=repro_srf mcxt=0x5686641e1b20
[PLy_function_restore_args fires with savedargs=0x5686641e28e8]
[nargs set to 2056017128]
Program received signal SIGSEGV, Segmentation fault.
__strlen_avx2 ()
PostgreSQL log:
server process (PID 366) was terminated by signal 11: Segmentation fault
all server processes terminated; reinitializing
AFFECTED CODE
-------------
src/pl/plpython/plpy_exec.c, lines 503-506:
  PLy_function_save_args allocates savedargs in proc->mcxt
src/pl/plpython/plpy_exec.c, lines 117-119:
  PLy_function_restore_args is called with potentially dangling savedargs
  (no check whether proc was rebuilt since savedargs was created)
src/pl/plpython/plpy_procedure.c, line 405 (PLy_procedure_delete):
  MemoryContextDelete(proc->mcxt) frees savedargs without nulling
  srfstate->savedargs
PROPOSED FIX
------------
The root cause is that srfstate->savedargs is tied to proc->mcxt (which can
be deleted at any per-call boundary) rather than to
funcctx->multi_call_memory_ctx (which lives for the entire SRF lifetime).
Option A — allocate savedargs in funcctx->multi_call_memory_ctx:
Change PLy_function_save_args to accept a MemoryContext parameter and pass
funcctx->multi_call_memory_ctx from PLy_exec_function. The saved PyObject*
references are valid regardless of which MemoryContext holds the struct.
Option B — detect proc rebuild and discard stale savedargs:
After PLy_procedure_get returns a new proc, check whether it differs from the
proc that created srfstate->savedargs. If so, discard savedargs
(PLy_function_drop_args or simply set to NULL) and skip the restore.
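The two options can be contrasted in a small model. This is a hedged Python sketch, not a patch: ToyContext, ToySavedArgs, option_a and option_b are hypothetical names, and object identity stands in for whatever "same proc" test a C implementation would use.

```python
class ToyContext:
    """Hypothetical memory context that is either alive or deleted."""
    def __init__(self):
        self.alive = True

    def delete(self):
        self.alive = False


class ToySavedArgs:
    """Hypothetical saved-args struct that remembers where it came from."""
    def __init__(self, owner, creator_proc, values):
        self.owner = owner                # context holding the struct
        self.creator_proc = creator_proc  # proc that was current when saved
        self.values = values

    def usable(self):
        return self.owner.alive


def option_a(proc_mcxt, multi_call_ctx, values):
    """Option A: allocate in the SRF-lifetime context, not the proc's."""
    saved = ToySavedArgs(multi_call_ctx, None, values)
    proc_mcxt.delete()                    # function replaced mid-iteration
    return saved.usable()


def option_b(saved, current_proc):
    """Option B: drop saved args that were created by a different proc."""
    if saved is not None and saved.creator_proc is not current_proc:
        return None                       # discard, caller skips the restore
    return saved


old_proc, new_proc = object(), object()
old_mcxt, srf_ctx = ToyContext(), ToyContext()

a_survives = option_a(old_mcxt, srf_ctx, ["tags"])

stale = ToySavedArgs(old_mcxt, old_proc, ["tags"])
b_same = option_b(stale, old_proc)
b_rebuilt = option_b(stale, new_proc)
```

Option A keeps the saved values usable across the rebuild, while Option B gives them up. In C, a bare pointer comparison for Option B could presumably be fooled if the rebuilt struct landed at the old address, so a stronger identity check may be needed; that caveat is an assumption and has not been tested here.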
* Re: BUG #19480: PL/Python SRF crashes (SIGSEGV) when function is replaced mid-iteration: use-after-free in PLy_funct
From: Matheus Alcantara @ 2026-05-25 22:26 UTC (permalink / raw)
To: [email protected]; [email protected]
On Fri May 15, 2026 at 8:11 AM -03, PG Bug reporting form wrote:
> The root cause is that srfstate->savedargs is tied to proc->mcxt (which can
> be deleted at any per-call boundary) rather than to
> funcctx->multi_call_memory_ctx (which lives for the entire SRF lifetime).
>
> Option A — allocate savedargs in funcctx->multi_call_memory_ctx:
> Change PLy_function_save_args to accept a MemoryContext parameter and pass
> funcctx->multi_call_memory_ctx from PLy_exec_function. The saved PyObject*
> references are valid regardless of which MemoryContext holds the struct.
>
> Option B — detect proc rebuild and discard stale savedargs:
> After PLy_procedure_get returns a new proc, check whether it differs from the
> proc that created srfstate->savedargs. If so, discard savedargs
> (PLy_function_drop_args or simply set to NULL) and skip the restore.
>
Hi, thank you for the very detailed bug report. I've managed to
reproduce the issue on master.
Option A seems to fix the issue (see attached patch), but while playing
with this I found another issue that I think is related:
CREATE OR REPLACE FUNCTION trigger_stack_overflow(x BIGINT)
RETURNS TABLE(i BIGINT) AS $$
import time
plpy.execute(f"CREATE TEMP TABLE _rt_{x} (x int)")
plpy.execute(f"DROP TABLE _rt_{x}")
time.sleep(0.3)
plpy.execute("SELECT trigger_stack_overflow(1)")
yield x
$$ LANGUAGE plpython3u VOLATILE;
Run SELECT trigger_stack_overflow(1) and, in another session, execute the
CREATE OR REPLACE, then wait for the first session to crash with this
stack trace:
frame #3: 0x000000010554a694 postgres`ExceptionalCondition(conditionName="proc->calldepth > 0", fileName="../src/pl/plpython/plpy_exec.c", lineNumber=701) at assert.c:65:2
frame #4: 0x0000000105e41984 plpython3.dylib`PLy_global_args_pop(proc=0x000000014b03cf00) at plpy_exec.c:701:2
frame #5: 0x0000000105e40d94 plpython3.dylib`PLy_exec_function(fcinfo=0x000000011e077738, proc=0x000000014b03cf00) at plpy_exec.c:264:3
The expected output from the first session should be something like
this:
ERROR: 54001: error fetching next item from iterator
DETAIL: spiexceptions.StatementTooComplex: error fetching next item from iterator
HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
This is because when PLy_procedure_delete() is executed from
PLy_procedure_get(), it also destroys state needed by recursive
functions, such as "calldepth", "argstack" and "globals". That causes the
assertion failure Assert(proc->calldepth > 0) in PLy_global_args_pop() when
it is executed in the PG_CATCH block of PLy_exec_function(), or EXC_BAD_ACCESS
when accessing "argstack" or "globals".
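One way to picture the lost recursion state is the toy model below. It is a Python sketch under stated assumptions, not the PL/Python code: ToyProc, args_push and args_pop are hypothetical, and looking the struct up by key on the way out stands in for the outer frame no longer seeing the state it pushed onto.

```python
class ToyProc:
    """Hypothetical stand-in for the recursion bookkeeping of a procedure."""
    def __init__(self):
        self.calldepth = 0
        self.argstack = []


def args_push(proc, args):
    if proc.calldepth > 0:
        proc.argstack.append(args)   # park the outer call's arguments
    proc.calldepth += 1


def args_pop(proc):
    # Explicit raise so the check also runs under "python -O".
    if proc.calldepth <= 0:
        raise AssertionError("proc->calldepth > 0")
    restored = proc.argstack.pop() if proc.calldepth > 1 else None
    proc.calldepth -= 1
    return restored


cache = {"fn": ToyProc()}

# The outer call enters through the cached struct.
outer = cache["fn"]
args_push(outer, {"x": 1})

# The nested call finds the pg_proc row changed: the cache entry is rebuilt
# and the nested call pushes onto the fresh struct instead.
cache["fn"] = ToyProc()
inner = cache["fn"]
args_push(inner, {"x": 1})
args_pop(inner)                      # balanced on the fresh struct

try:
    args_pop(cache["fn"])            # outer unwinds against rebuilt state
    outcome = "ok"
except AssertionError as exc:
    outcome = str(exc)
```

The push made before the rebuild has no matching counter afterwards, so the unwinding pop trips the same condition as the assertion in the stack trace.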
Although changing the memory context where savedargs is allocated fixes
the reported issue, I think the long-term fix is to preserve the
necessary execution state during PLyProcedure re-creation. I'm
still studying the code to see if and how this can be implemented.
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
From 61f46abd4509cc519de3e43adfd9e0b4fa0f6fcb Mon Sep 17 00:00:00 2001
From: Matheus Alcantara <[email protected]>
Date: Mon, 25 May 2026 19:22:09 -0300
Subject: [PATCH] plpython: Use correct memory context for savedargs
---
src/pl/plpython/plpy_exec.c | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/src/pl/plpython/plpy_exec.c b/src/pl/plpython/plpy_exec.c
index de0dad1f533..d93e800e0be 100644
--- a/src/pl/plpython/plpy_exec.c
+++ b/src/pl/plpython/plpy_exec.c
@@ -31,7 +31,7 @@ typedef struct PLySRFState
} PLySRFState;
static PyObject *PLy_function_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc);
-static PLySavedArgs *PLy_function_save_args(PLyProcedure *proc);
+static PLySavedArgs *PLy_function_save_args(MemoryContext mctx, PLyProcedure *proc);
static void PLy_function_restore_args(PLyProcedure *proc, PLySavedArgs *savedargs);
static void PLy_function_drop_args(PLySavedArgs *savedargs);
static void PLy_global_args_push(PLyProcedure *proc);
@@ -176,8 +176,15 @@ PLy_exec_function(FunctionCallInfo fcinfo, PLyProcedure *proc)
* This won't be last call, so save argument values. We do
* this again each time in case the iterator is changing those
* values.
+ *
+ * We use funcctx->multi_call_memory_ctx to ensure savedargs
+ * survives across ValuePerCall invocations, but is cleaned up
+ * when the SRF completes. This also protects against the
+ * case where the procedure is deleted (via
+ * PLy_procedure_delete()) while the SRF is running.
*/
- srfstate->savedargs = PLy_function_save_args(proc);
+ srfstate->savedargs = PLy_function_save_args(funcctx->multi_call_memory_ctx,
+ proc);
}
}
@@ -536,13 +543,13 @@ PLy_function_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc)
* available via the proc's globals :-( ... but we're stuck with that now.
*/
static PLySavedArgs *
-PLy_function_save_args(PLyProcedure *proc)
+PLy_function_save_args(MemoryContext mctx, PLyProcedure *proc)
{
PLySavedArgs *result;
- /* saved args are always allocated in procedure's context */
+ /* Allocate in the caller-specified memory context */
result = (PLySavedArgs *)
- MemoryContextAllocZero(proc->mcxt,
+ MemoryContextAllocZero(mctx,
offsetof(PLySavedArgs, namedargs) +
proc->nargs * sizeof(PyObject *));
result->nargs = proc->nargs;
@@ -658,8 +665,14 @@ PLy_global_args_push(PLyProcedure *proc)
{
PLySavedArgs *node;
- /* Build a struct containing current argument values */
- node = PLy_function_save_args(proc);
+ /*
+ * Build a struct containing current argument values. We use
+ * proc->mcxt because the saved args must persist across the entire
+ * recursive call stack, which can span multiple function invocations.
+ * The procedure's memory context has the appropriate lifetime for
+ * this, and we explicitly free the struct when popping.
+ */
+ node = PLy_function_save_args(proc->mcxt, proc);
/*
* Push the saved argument values into the procedure's stack. Once we
--
2.50.1 (Apple Git-155)
Attachments:
[text/plain] 0001-plpython-Use-correct-memory-context-for-savedargs.patch (3.1K, 2-0001-plpython-Use-correct-memory-context-for-savedargs.patch)