Date: Mon, 01 Jun 2026 19:14:34 -0300
From: "Matheus Alcantara"
To: "Tom Lane"
Subject: Re: BUG #19480: PL/Python SRF crashes (SIGSEGV) when function is replaced mid-iteration: use-after-free in PLy_funct
References: <19480-f1f9fdce30462fc4@postgresql.org> <982975.1779981146@sss.pgh.pa.us>
In-Reply-To: <982975.1779981146@sss.pgh.pa.us>

On Thu May 28, 2026 at 12:12 PM -03, Tom Lane wrote:
> "Matheus Alcantara" writes:
>> On Fri May 15, 2026 at 8:11 AM -03, PG Bug reporting form wrote:
>>> The root cause is that srfstate->savedargs is tied to proc->mcxt (which can
>>> be deleted at any per-call boundary) rather than to
>>> funcctx->multi_call_memory_ctx (which lives for the entire SRF lifetime).
>
>> Option A seems to fix the issue (see attached patch) but I've found
>> another issue while playing with this that I think it's related:
>> ...
>> This is because when PLy_procedure_delete() is executed on
>> PLy_procedure_get() it also destroy information related with recursive
>> functions, such as "calldepth", "argstack" and "globals" which cause the
>> assert failure Assert(proc->calldepth > 0) on PLy_global_args_pop() when
>> it's executed on PG_CATCH block on PLy_exec_function() or EXC_BAD_ACCESS
>> when accessing "argstack" or "globals".
>
> Yeah.  The bigger picture though is: if we are re-entrantly calling
> either a recursive function or a SRF, we should not destroy any of the
> existing state, nor do we want to replace the function body.  The only
> way to have sane behavior is to keep executing the same function body
> until the execution instance (recursion level or continued SRF) is
> done.  So these concerns about associated state are only part of the
> problem.
>
> plpgsql ran into this years ago, and its solution has been to maintain
> a reference count on each function parsetree and not destroy an
> obsoleted parsetree till the reference count goes to zero.
> I've had in the back of my head that the other PLs need to do likewise,
> but it hasn't gotten to the front of the to-do list, mainly because the
> other PLs are much less used and so field complaints about this have
> been rare.  I had hoped also that the language interpreters underlying
> the other PLs might solve some of this for us, but it's unclear to what
> extent they help.  Certainly it's not cool to be clobbering our own
> execution state that's outside the language interpreter.
>
> We might want to go as far as converting the other PLs to use the
> utils/cache/funccache.c infrastructure, but perhaps there is a
> less invasive fix.  Certainly, a fix based on funccache.c could not
> be back-patched.  (On the other hand, given the rarity of complaints,
> perhaps a HEAD-only fix is acceptable.)
>

I've been exploring the funccache.c approach for plpython. The main
challenge is that plpython uses SFRM_ValuePerCall for SRFs, whereas
plpgsql uses SFRM_Materialize. This means plpgsql can simply increment
use_count at the start of plpgsql_call_handler() and decrement it at
the end, since all results are produced in a single call. For plpython,
ExecMakeTableFunctionResult() calls the handler multiple times, with
use_count returning to zero between calls.

With ValuePerCall, cached_function_compile() may try to re-create an
invalidated cache entry, because use_count can be 0 while
ExecMakeTableFunctionResult() is still in the middle of its loop. In
that case, the SRFState would be lost for the currently running
plpython function.

I'm still not sure how to proceed here, but it seems we would need some
refactoring in plpython to make it work with funccache. I'm not sure
whether changing ValuePerCall to Materialize is the way to go, or
whether there is another way to fix this.

I've also tried to fix this without funccache, but it seems we would
end up implementing something similar anyway. That might be a way to
go, but I'm not sure it's the best path.

Thoughts?
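
To make the use_count problem concrete, here is a standalone toy model
in C. It is only a sketch and not PostgreSQL code: ToyFunc, toy_lookup()
and toy_run_srf() are hypothetical names, and toy_lookup() merely
imitates the rule "an obsolete entry may be destroyed once its use count
is zero". It compares pinning the entry per handler call with holding
one pin for the whole SRF, which is an assumption about one possible
direction rather than a worked-out fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a cached function entry (not the real CachedFunction). */
typedef struct ToyFunc
{
    int  use_count;  /* executions currently pinning this entry */
    bool obsolete;   /* set when the function definition is replaced */
    bool freed;      /* recorded instead of calling free(), so it can be inspected */
} ToyFunc;

/* Toy lookup: an obsolete entry is destroyed only if nobody is using it. */
static ToyFunc *
toy_lookup(ToyFunc *cur)
{
    if (cur->obsolete && cur->use_count == 0)
    {
        ToyFunc *fresh = calloc(1, sizeof(ToyFunc));

        assert(fresh != NULL);
        cur->freed = true;      /* per-function state would be lost here */
        return fresh;
    }
    return cur;
}

/*
 * Simulate a three-row value-per-call SRF whose definition is replaced
 * after the first row.  Returns true if the entry that started the SRF
 * was destroyed before the last row was produced.
 */
static bool
toy_run_srf(bool pin_for_whole_srf)
{
    ToyFunc *start = calloc(1, sizeof(ToyFunc));
    ToyFunc *cur = start;
    bool     lost;

    assert(start != NULL);
    if (pin_for_whole_srf)
        start->use_count++;     /* held until the SRF is done */

    for (int call = 0; call < 3; call++)
    {
        cur = toy_lookup(cur);  /* done on every handler entry */
        cur->use_count++;       /* per-call pin */
        /* ... produce one row ... */
        cur->use_count--;       /* drops to zero without the SRF-wide pin */

        if (call == 0)
            start->obsolete = true;     /* CREATE OR REPLACE happens here */
    }

    lost = start->freed;
    if (pin_for_whole_srf)
        start->use_count--;
    if (cur != start)
        free(cur);
    free(start);
    return lost;
}
```

In this model toy_run_srf(false) returns true, because the second call
finds an obsolete entry with a zero use count and replaces it, while
toy_run_srf(true) returns false, because the extra pin keeps the
original entry alive until the last row.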

--
Matheus Alcantara
EDB: https://www.enterprisedb.com