From: Matheus Alcantara <[email protected]>
To: Tom Lane <[email protected]>
Cc: [email protected]
Cc: [email protected]
Subject: Re: BUG #19480: PL/Python SRF crashes (SIGSEGV) when function is replaced mid-iteration: use-after-free in PLy_funct
Date: Mon, 01 Jun 2026 19:14:34 -0300
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
On Thu May 28, 2026 at 12:12 PM -03, Tom Lane wrote:
> "Matheus Alcantara" <[email protected]> writes:
>> On Fri May 15, 2026 at 8:11 AM -03, PG Bug reporting form wrote:
>>> The root cause is that srfstate->savedargs is tied to proc->mcxt (which can
>>> be deleted at any per-call boundary) rather than to
>>> funcctx->multi_call_memory_ctx (which lives for the entire SRF lifetime).
>
>> Option A seems to fix the issue (see attached patch) but I've found
>> another issue while playing with this that I think it's related:
>> ...
>> This is because when PLy_procedure_delete() is executed on
>> PLy_procedure_get() it also destroy information related with recursive
>> functions, such as "calldepth", "argstack" and "globals" which cause the
>> assert failure Assert(proc->calldepth > 0) on PLy_global_args_pop() when
>> it's executed on PG_CATCH block on PLy_exec_function() or EXC_BAD_ACCESS
>> when accessing "argstack" or "globals".
>
> Yeah. The bigger picture though is: if we are re-entrantly calling
> either a recursive function or a SRF, we should not destroy any of the
> existing state, nor do we want to replace the function body. The only
> way to have sane behavior is to keep executing the same function body
> until the execution instance (recursion level or continued SRF) is
> done. So these concerns about associated state are only part of the
> problem.
>
> plpgsql ran into this years ago, and its solution has been to maintain
> a reference count on each function parsetree and not destroy an
> obsoleted parsetree till the reference count goes to zero. I've had
> in the back of my head that the other PLs need to do likewise, but it
> hasn't gotten to the front of the to-do list, mainly because the other
> PLs are much less used and so field complaints about this have been
> rare. I had hoped also that the language interpreters underlying the
> other PLs might solve some of this for us, but it's unclear to what
> extent they help. Certainly it's not cool to be clobbering our own
> execution state that's outside the language interpreter.
>
> We might want to go as far as converting the other PLs to use the
> utils/cache/funccache.c infrastructure, but perhaps there is a
> less invasive fix. Certainly, a fix based on funccache.c could not
> be back-patched. (On the other hand, given the rarity of complaints,
> perhaps a HEAD-only fix is acceptable.)
>
I've been exploring the funccache.c approach for plpython. The main
challenge is that plpython uses SFRM_ValuePerCall for SRFs, whereas
plpgsql uses SFRM_Materialize. This means plpgsql can simply increment
use_count at the start of plpgsql_call_handler() and decrement it at the
end, since all results are produced in a single call. For plpython,
ExecMakeTableFunctionResult() calls the handler multiple times, with
use_count returning to zero between calls.
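
To make the difference concrete, here is a toy Python model of the two
calling patterns. It is only a sketch, not PostgreSQL code: CacheEntry,
call_handler, run_materialize and run_value_per_call are hypothetical
names, and the only thing modeled is a counter that is incremented on
handler entry and decremented on exit.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """Toy stand-in for a cached function entry (not the real C struct)."""
    use_count: int = 0


def call_handler(entry, produce):
    """Pin the entry for exactly one handler invocation, then unpin it."""
    entry.use_count += 1
    try:
        return produce()
    finally:
        entry.use_count -= 1


def run_materialize(entry, rows):
    """Materialize-style: a single handler call builds the whole result."""
    seen_inside = []

    def produce():
        out = []
        for row in rows:
            # The entry stays pinned while every row is produced.
            seen_inside.append(entry.use_count)
            out.append(row)
        return out

    return call_handler(entry, produce), seen_inside


def run_value_per_call(entry, rows):
    """ValuePerCall-style: the caller's loop invokes the handler per row."""
    result, seen_between = [], []
    for row in rows:
        result.append(call_handler(entry, lambda row=row: row))
        # This is what a cache lookup would observe between two calls.
        seen_between.append(entry.use_count)
    return result, seen_between
```

In this model the materialize variant observes use_count == 1 for every
row, while the value-per-call variant observes 0 after each row, which
is the window in which the entry looks unused.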

With ValuePerCall, cached_function_compile() may try to rebuild an
invalidated cache entry, because use_count can be 0 while
ExecMakeTableFunctionResult() is still in the middle of its loop. In
that case, the SRFState of the currently running plpython function
would be lost.
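
A toy illustration of that failure mode, again in Python and again only
a sketch under assumptions: Entry, lookup and handler are hypothetical
names, the function "body" is just a list of rows, and the per-SRF
state is an iterator stored on the entry. The lookup rebuilds an
invalidated entry whenever use_count is 0, which is the behavior
described above, not a copy of the real funccache.c logic.

```python
class Entry:
    """Toy cache entry holding a function body and its SRF iteration state."""

    def __init__(self, body):
        self.body = body        # stand-in for the compiled function
        self.valid = True
        self.use_count = 0
        self.srf_state = None   # iterator that must survive across calls


def lookup(cache, key, current_body):
    """Return the cached entry, rebuilding it if invalid and unpinned."""
    entry = cache.get(key)
    if entry is None or (not entry.valid and entry.use_count == 0):
        entry = Entry(current_body)   # any old srf_state is dropped here
        cache[key] = entry
    return entry


def handler(cache, key, current_body):
    """One ValuePerCall-style invocation: returns the next row."""
    entry = lookup(cache, key, current_body)
    entry.use_count += 1
    try:
        if entry.srf_state is None:
            entry.srf_state = iter(entry.body)
        return next(entry.srf_state)
    finally:
        entry.use_count -= 1


cache = {}
first = handler(cache, "f", [1, 2, 3])       # first row of the old body
cache["f"].valid = False                     # function replaced mid-iteration
second = handler(cache, "f", [10, 20, 30])   # entry rebuilt, state lost
```

Here first is 1 and second is 10: the second call does not continue the
original iteration, because the entry holding its state was replaced
while use_count was 0. In the toy the iteration merely restarts on the
new body; it does not reproduce the crash from the bug report.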

I'm still not sure how to proceed here, but it seems like we would need
some refactoring in plpython to make it work with funccache. I'm not
sure whether changing ValuePerCall to Materialize is the way to go, or
whether there's another way to fix this.

I've also tried to fix this without funccache, but it seems like we
would end up implementing something similar anyway. That might be a way
to go, but I'm also not sure if it's the best path.

Thoughts?

--
Matheus Alcantara
EDB: https://www.enterprisedb.com