From: Tom Lane
To: pgsql-hackers@lists.postgresql.org
Subject: Non-robust plpgsql_trap test
Date: Tue, 21 Apr 2026 09:54:54 -0400

I've noticed a few buildfarm failures similar to [1]:

# diff -U3 /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out
# --- /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out 2026-04-21 04:22:01.030204342 -0300
# +++ /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out 2026-04-21 04:29:54.795187855 -0300
# @@ -155,7 +155,7 @@
#  begin;
#  set statement_timeout to 1000;
#  select trap_timeout();
# -NOTICE: nyeah nyeah, can't stop me
# +NOTICE: caught others?
#  ERROR: end of function
#  CONTEXT: PL/pgSQL function trap_timeout() line 15 at RAISE
#  rollback;
not ok 11 - plpgsql_trap 502 ms

which is coming from unexpected behavior of this bit of plpgsql code:

    begin
      -- we assume this will take longer than 1 second:
      select count(*) into x from generate_series(1, 1_000_000_000_000);
    exception
      when others then
        raise notice 'caught others?';
      when query_canceled then
        raise notice 'nyeah nyeah, can''t stop me';
    end;

The light bulb went on when I noticed a nearby failure from the same
machine that was clearly traceable to out-of-disk-space. What
happened here, I have no doubt, was that the "from generate_series"
bit tried to make a large temporary file, ran out of space, and threw
an appropriate error, causing us to take the "wrong" exception handler.

Proposal:

1. Replace that query with something not so resource-intensive.
I'm not really sure why we didn't just use "perform pg_sleep(10)".
Maybe it didn't exist or didn't reliably wait 10 seconds at the
time, but it does now.

2. Adjust the "when others" handler to report the actual error,
to make this sort of thing easier to debug next time.

regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=caiman&dt=2026-04-21%2007%3A21%3A57
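
For concreteness, here is a rough sketch of how the two proposed changes
might look in the exception block quoted in the message. It is untested
and not a committed patch; the notice wording and the choice to print
both SQLSTATE and SQLERRM are assumptions made for illustration.

    begin
      -- a sleep needs no temp-file space, so only statement_timeout
      -- should be able to interrupt it:
      perform pg_sleep(10);
    exception
      when others then
        -- report what was actually caught, for easier debugging:
        raise notice 'caught others? % %', sqlstate, sqlerrm;
      when query_canceled then
        raise notice 'nyeah nyeah, can''t stop me';
    end;

The handler order is left as it was: in PL/pgSQL, OTHERS does not match
QUERY_CANCELED, so a statement timeout should still reach the second
handler, and the first one fires only for some other, unexpected error.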