public inbox for [email protected]  
help / color / mirror / Atom feed
From: Amit Langote <[email protected]>
To: Evan Montgomery-Recht <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
Date: Wed, 8 Apr 2026 18:58:23 +0900
Message-ID: <CA+HiwqFdkJGBU9QmYkm6056SuhunGk7yFCuVP=2vBejJa+qhGg@mail.gmail.com> (raw)
In-Reply-To: <CA+HiwqFK8rXTNknxV0MMe++7W08g3kY6eeLK21A1xCrK2Wuk8Q@mail.gmail.com>
References: <CA+HiwqF4C0ws3cO+z5cLkPuvwnAwkSp7sfvgGj3yQ=Li6KNMqA@mail.gmail.com>
	<CA+HiwqGM6nvAV5O+=Nr+BXMPWOma0oeCRVzVP0XiLE8zX5TVAg@mail.gmail.com>
	<CA+HiwqGMaovCUgDbGxVGnK0Mrivr+ph3YE2Ws+47-ugyPb4f7g@mail.gmail.com>
	<CAFj8pRDaiBe_GOLk_yyYHTtPiDAAaLOM8u1-=Q3ZgXBTH+1igg@mail.gmail.com>
	<CA+HiwqGA5Ay_MR0eJEEbt4j6WrVh4F+AasTp8yCbs5aJLOJn6Q@mail.gmail.com>
	<CAEG8a3JM=NoqiTK0V6S9FNxZPvy1+C5F7rfafTtPKBVWnunL-g@mail.gmail.com>
	<CA+HiwqEyiLCY6MTLbOJXDdLNNQLaURYHvdW797MQgbjEK9od4Q@mail.gmail.com>
	<CAEG8a3+VBpwPf1Rm-ECD90whM9b3YnGhux5CVXdsL6khiBfzRQ@mail.gmail.com>
	<CA+HiwqF2UHzF0sKCp-F2a-U29rqh_9ZPy=f1h+Fh_=M8efj3pg@mail.gmail.com>
	<CAEG8a3L9Ew-WL8sxLROVOcypeaENPmd8qCmMvz4geoGL1TDGCA@mail.gmail.com>
	<CAEG8a3+nUFQo4sdPQF9xy0J73J8RFJ5U9A5+_kMosGDaZ+1sXA@mail.gmail.com>
	<[email protected]>
	<CAEG8a3JyKdizWvYsF+z_mA1BKy=dpW11iKVMOG=bk6Tbz6M1Bw@mail.gmail.com>
	<CAEG8a3+Hf4tvvbts29_k_AFhWQmRYfEo_SW4C5FY_140iKghBw@mail.gmail.com>
	<CA+HiwqFV-PY-3BxM6j5TaAiC3AwedDxo-6vwRSbvygg3zF+xAQ@mail.gmail.com>
	<CA+HiwqHpaisS-e+0gAgzh92qZAFxncAJMmmTRZZN=efoeTPgOg@mail.gmail.com>
	<CA+HiwqFwZ6WLRbY8R7VC7JVp5Jot6ktZOkr9wDxTjoK=W1SgdQ@mail.gmail.com>
	<CA+HiwqF_Kz=R8juHJBiOATvabWSOugK4VaGOxoJ_n=E2c7UM9w@mail.gmail.com>
	<CA+HiwqHCB7kcbspkhaLN9enoj5x=ehzhFM4PXDgWUUP8Px41GA@mail.gmail.com>
	<[email protected]>
	<CA+HiwqHpVtP485wEKuXdOkdoZWhvVvfFH40-og07Jp3MPx21eg@mail.gmail.com>
	<CAEG8a3JWHkJSXe9nNcVK7wnYKUEqWuMGFDhy5BynB_9tEjmEUg@mail.gmail.com>
	<CA+HiwqFjfumKrWy03q5M309xJJVYt0WgGfH6AZ8BjFhSwppwsQ@mail.gmail.com>
	<CAEG8a3JjP1LaKSv-r3AMJLRyLMzENJrKshWsDvDouMPM_sizmA@mail.gmail.com>
	<CA+HiwqFQ+ZA7hSOygv4uv_t75B3r0_gosjadetCsAEoaZwTu6g@mail.gmail.com>
	<CA+HiwqHdB0r8z6Mut13BxpYNq2W-os+Arju4mcZbCyU9PeaVog@mail.gmail.com>
	<CAEG8a3K5ayVNkCDnK9OtNb+4OY0chJtW6ObgEOSFjqyymQda8Q@mail.gmail.com>
	<CA+HiwqGJYCgEs_F-LBtrRdx-Y969LMr-OVogjFXU6U-0q5bOwQ@mail.gmail.com>
	<CA+HiwqF2Ma6R_QyjfPBvFreYbezFpGcwASJE0a2bM+Y0jvof+g@mail.gmail.com>
	<[email protected]>
	<CA+HiwqGFRAH2O5bGpNMErFopFyn-2-Zu3+5y+BFQim9TE8z+Pw@mail.gmail.com>
	<[email protected]>
	<CA+HiwqFx=aciJYkkaviyTutUm303QXz6GtXSqzG7nfd4MAzddQ@mail.gmail.com>
	<CA+HiwqH39QU7vGVx65JH1e0nzVvQc5eVmuY7=qyj0T_+b-HO3A@mail.gmail.com>
	<[email protected]>
	<CA+HiwqF+jAyHUiNzpR+vRBbpeCiVAdFU52-ffTGko9Zit317oA@mail.gmail.com>
	<CAEg7pwcKf01FmDqFAf-Hzu_pYnMYScY_Otid-pe9uw3BJ6gq9g@mail.gmail.com>
	<CA+HiwqFK8rXTNknxV0MMe++7W08g3kY6eeLK21A1xCrK2Wuk8Q@mail.gmail.com>

On Wed, Apr 8, 2026 at 10:23 AM Amit Langote <[email protected]> wrote:
> On Tue, Apr 7, 2026 at 10:00 PM Evan Montgomery-Recht
> <[email protected]> wrote:
> > The patch also adds a test module (test_spi_func) with a C function
> > that executes SQL via SPI_connect/SPI_execute/SPI_finish, since this
> > crash cannot be triggered from PL/pgSQL. The test exercises the
> > C-level SPI INSERT with multiple FK constraints, FK violations, and
> > nested PL/pgSQL-calls-C-SPI (matching the PostGIS call pattern).

I applied only the test module changes and it passes (without
crashing) even without your proposed fix. It seems that's because the
C function in test_spi_func calling SPI is using the same resource
owner as the parent SELECT. I think you'd need to create a resource
owner manually in the spi_exec() C function to reproduce the crash, as
done in the attached 0001, which contains the src/test changes
extracted from your patch modified as described, including renaming
the C function to spi_exec_sql().

Also, the test cases that call spi_exec() (_sql()) directly from a
SELECT don't actually exercise the crash path because there is no
outer trigger-firing loop active. query_depth is 0 inside the inner
SPI's AfterTriggerEndQuery, so the old guard wouldn't suppress the
callback there anyway. The critical case requires spi_exec_sql() to be
called from inside an AFTER trigger, where query_depth > 0 causes the
guard to defer the callback past the inner resource owner's lifetime.
I've added that test case. I kept your original test cases as they
still provide useful coverage of C-level SPI FK behavior even if they
don't exercise the crash path specifically.  Maybe your original
PostGIS test suite that hit the crash did have the right structure,
but that's not reflected in the patch as far as I can tell.

I've also renamed the module to test_spi_resowner to better reflect
what it's about.

For the fix, I have a different proposal. As you observed, the
query_depth > 0 early return in FireAfterTriggerBatchCallbacks() means
that the nested SPI's callbacks get called under the outer resource
owner, which may not be the same as the one that SPI used. I think it
was a mistake to have that early return in the first place. Instead we
could remember for each callback what firing level it should be called
at, so the nested SPI's callbacks fire before returning to the parent
level and parent-level callbacks fire when the parent level completes.
I have implemented that in the attached 0002 along with transaction
boundary cleanup of callbacks, which passes the check-world for me,
but I'll need to stare some more at it before committing.

Let me know if this also fixes your own in-house test suite or if you
have any other suggestions or if you think I am missing something.

--
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v2-0001-Modified-test-suite-from-Evan-s-patch.patch (17.1K, 2-v2-0001-Modified-test-suite-from-Evan-s-patch.patch)
  download | inline diff:
From 2da74146aafbd9d505e9e0d9038138bc46f0cd08 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 8 Apr 2026 13:53:08 +0900
Subject: [PATCH v2 1/2] Modified test suite from Evan's patch

The C function in the original test module was using the same resource
owner as the parent SELECT, so the crash could not be reproduced.
Added a dedicated resource owner around the SPI call to ensure the inner
resource owner is released before the outer trigger-firing batch callback
fires, which is necessary to trigger the crash this test is meant to
cover.
---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/test_spi_resowner/Makefile   |  23 ++++
 .../expected/ri_fastpath.out                  | 118 ++++++++++++++++++
 .../modules/test_spi_resowner/meson.build     |  31 +++++
 .../test_spi_resowner/sql/ri_fastpath.sql     | 107 ++++++++++++++++
 .../test_spi_resowner--1.0.sql                |   9 ++
 .../test_spi_resowner/test_spi_resowner.c     |  70 +++++++++++
 .../test_spi_resowner.control                 |   4 +
 9 files changed, 364 insertions(+)
 create mode 100644 src/test/modules/test_spi_resowner/Makefile
 create mode 100644 src/test/modules/test_spi_resowner/expected/ri_fastpath.out
 create mode 100644 src/test/modules/test_spi_resowner/meson.build
 create mode 100644 src/test/modules/test_spi_resowner/sql/ri_fastpath.sql
 create mode 100644 src/test/modules/test_spi_resowner/test_spi_resowner--1.0.sql
 create mode 100644 src/test/modules/test_spi_resowner/test_spi_resowner.c
 create mode 100644 src/test/modules/test_spi_resowner/test_spi_resowner.control

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 0a74ab5c86f..016b328c8c5 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -52,6 +52,7 @@ SUBDIRS = \
 		  test_shmem \
 		  test_shm_mq \
 		  test_slru \
+		  test_spi_resowner \
 		  test_tidstore \
 		  unsafe_tests \
 		  worker_spi \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 4bca42bb370..3ca454064d0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -53,6 +53,7 @@ subdir('test_saslprep')
 subdir('test_shmem')
 subdir('test_shm_mq')
 subdir('test_slru')
+subdir('test_spi_resowner')
 subdir('test_tidstore')
 subdir('typcache')
 subdir('unsafe_tests')
diff --git a/src/test/modules/test_spi_resowner/Makefile b/src/test/modules/test_spi_resowner/Makefile
new file mode 100644
index 00000000000..5a69e3a3c42
--- /dev/null
+++ b/src/test/modules/test_spi_resowner/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/test_spi_resowner/Makefile
+
+MODULE_big = test_spi_resowner
+OBJS = \
+	$(WIN32RES) \
+	test_spi_resowner.o
+PGFILEDESC = "test_spi_resowner - SQL-callable C SPI function under a dedicated ResourceOwner"
+
+EXTENSION = test_spi_resowner
+DATA = test_spi_resowner--1.0.sql
+
+REGRESS = ri_fastpath
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_spi_resowner
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_spi_resowner/expected/ri_fastpath.out b/src/test/modules/test_spi_resowner/expected/ri_fastpath.out
new file mode 100644
index 00000000000..ad6b0f7c9b3
--- /dev/null
+++ b/src/test/modules/test_spi_resowner/expected/ri_fastpath.out
@@ -0,0 +1,118 @@
+--
+-- Test RI fast-path FK check under C-level SPI.
+--
+-- The RI fast-path caches PK relation references in ri_FastPathGetEntry()
+-- under the current resource owner.  When FK triggers fire inside a
+-- C-level SPI context that creates a dedicated short-lived resource owner,
+-- those references must be released before the inner resource owner is
+-- released.  The fix ensures batch callbacks fire at the same firing depth
+-- at which they were registered, while the corresponding resource owner
+-- is still alive.  Without this, ri_FastPathTeardown would crash with
+-- Assert(rel->rd_refcnt > 0) in index_close.
+--
+-- Simple PL/pgSQL does not trigger this because its SPI connection spans
+-- the entire function call, so its resource owner outlives the batch
+-- callback.  The critical test case requires a C function that creates a
+-- dedicated short-lived resource owner around its SPI call.
+--
+CREATE EXTENSION test_spi_resowner;
+CREATE TABLE ri_fp_pk1 (id serial PRIMARY KEY);
+CREATE TABLE ri_fp_pk2 (id serial PRIMARY KEY);
+CREATE TABLE ri_fp_pk3 (id serial PRIMARY KEY);
+INSERT INTO ri_fp_pk1 VALUES (1);
+INSERT INTO ri_fp_pk2 VALUES (1);
+INSERT INTO ri_fp_pk3 VALUES (1);
+CREATE TABLE ri_fp_fk (
+    id serial PRIMARY KEY,
+    a int REFERENCES ri_fp_pk1(id),
+    b int REFERENCES ri_fp_pk2(id),
+    c int REFERENCES ri_fp_pk3(id),
+    d int REFERENCES ri_fp_pk1(id),
+    e int REFERENCES ri_fp_pk2(id),
+    f int REFERENCES ri_fp_pk3(id)
+);
+-- C-level SPI INSERT: the critical test case.
+-- Without the fix this crashes the server.
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (a, b, c, d, e, f) VALUES (1, 1, 1, 1, 1, 1)');
+ spi_exec_sql 
+--------------
+ 
+(1 row)
+
+-- Additional C-level SPI INSERTs to exercise batch reuse across calls.
+-- Use different column orderings to ensure each is a distinct statement.
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (f, e, d, c, b, a) VALUES (1, 1, 1, 1, 1, 1)');
+ spi_exec_sql 
+--------------
+ 
+(1 row)
+
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (a, c, e, b, d, f) VALUES (1, 1, 1, 1, 1, 1)');
+ spi_exec_sql 
+--------------
+ 
+(1 row)
+
+-- C-level SPI with FK violation: should error, not crash
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (a, b, c, d, e, f) VALUES (999, 1, 1, 1, 1, 1)');
+ERROR:  insert or update on table "ri_fp_fk" violates foreign key constraint "ri_fp_fk_a_fkey"
+DETAIL:  Key (a)=(999) is not present in table "ri_fp_pk1".
+CONTEXT:  SQL statement "INSERT INTO ri_fp_fk (a, b, c, d, e, f) VALUES (999, 1, 1, 1, 1, 1)"
+-- Nested: PL/pgSQL calling C SPI (mimics PostGIS toTopoGeom pattern)
+CREATE FUNCTION plpgsql_calls_c_spi() RETURNS void AS $$
+DECLARE
+    ins_stmt text := 'INSERT INTO ri_fp_fk (a, b, c, d, e, f) VALUES (1, 1, 1, 1, 1, 1)';
+BEGIN
+    PERFORM spi_exec_sql(ins_stmt);
+END;
+$$ LANGUAGE plpgsql;
+SELECT plpgsql_calls_c_spi();
+ plpgsql_calls_c_spi 
+---------------------
+ 
+(1 row)
+
+-- AFTER trigger that uses C-level SPI to insert into an FK-referencing table.
+-- The FK batch callback is registered at the inner SPI's firing depth and
+-- must fire before the inner resource owner is released.  This exercises
+-- the depth-matched callback firing introduced to fix that crash.
+CREATE TABLE ri_fp_outer (id int PRIMARY KEY);
+CREATE TABLE ri_fp_inner (id int REFERENCES ri_fp_pk1(id));
+CREATE FUNCTION outer_trigger_spi_ok() RETURNS trigger AS $$
+BEGIN
+    PERFORM spi_exec_sql('INSERT INTO ri_fp_inner VALUES (1)');
+    RETURN NEW;
+END $$ LANGUAGE plpgsql;
+CREATE TRIGGER outer_tg AFTER INSERT ON ri_fp_outer
+    FOR EACH ROW EXECUTE FUNCTION outer_trigger_spi_ok();
+-- Fires outer_tg, whose PL/pgSQL body calls spi_exec_sql().  The C function
+-- creates a dedicated resource owner; the FK batch callback fires at the
+-- inner SPI's firing depth before that resource owner is released.
+INSERT INTO ri_fp_outer VALUES (1);
+CREATE FUNCTION outer_trigger_spi_fail() RETURNS trigger AS $$
+BEGIN
+    PERFORM spi_exec_sql('INSERT INTO ri_fp_inner VALUES (3)');
+    RETURN NEW;
+END $$ LANGUAGE plpgsql;
+DROP TRIGGER outer_tg ON ri_fp_outer;
+DROP FUNCTION outer_trigger_spi_ok();
+CREATE TRIGGER outer_tg AFTER INSERT ON ri_fp_outer
+    FOR EACH ROW EXECUTE FUNCTION outer_trigger_spi_fail();
+--  Like above but the inner insert fails.
+INSERT INTO ri_fp_outer VALUES (2);
+ERROR:  insert or update on table "ri_fp_inner" violates foreign key constraint "ri_fp_inner_id_fkey"
+DETAIL:  Key (id)=(3) is not present in table "ri_fp_pk1".
+CONTEXT:  SQL statement "INSERT INTO ri_fp_inner VALUES (3)"
+SQL statement "SELECT spi_exec_sql('INSERT INTO ri_fp_inner VALUES (3)')"
+PL/pgSQL function outer_trigger_spi_fail() line 3 at PERFORM
+DROP TRIGGER outer_tg ON ri_fp_outer;
+DROP FUNCTION outer_trigger_spi_fail();
+DROP TABLE ri_fp_inner, ri_fp_outer;
+-- Cleanup
+DROP TABLE ri_fp_fk;
+DROP TABLE ri_fp_pk3, ri_fp_pk2, ri_fp_pk1;
+DROP EXTENSION test_spi_resowner;
diff --git a/src/test/modules/test_spi_resowner/meson.build b/src/test/modules/test_spi_resowner/meson.build
new file mode 100644
index 00000000000..fbb027e05c7
--- /dev/null
+++ b/src/test/modules/test_spi_resowner/meson.build
@@ -0,0 +1,31 @@
+test_spi_resowner_sources = files(
+  'test_spi_resowner.c',
+)
+
+if host_system == 'windows'
+  test_spi_resowner_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'test_spi_resowner',
+    '--FILEDESC', 'test_spi_resowner - SQL-callable C SPI function under a dedicated ResourceOwner',])
+endif
+
+test_spi_resowner = shared_module('test_spi_resowner',
+  test_spi_resowner_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += test_spi_resowner
+
+test_install_data += files(
+  'test_spi_resowner.control',
+  'test_spi_resowner--1.0.sql',
+)
+
+tests += {
+  'name': 'test_spi_resowner',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'regress': {
+    'sql': [
+      'ri_fastpath',
+    ],
+  },
+}
diff --git a/src/test/modules/test_spi_resowner/sql/ri_fastpath.sql b/src/test/modules/test_spi_resowner/sql/ri_fastpath.sql
new file mode 100644
index 00000000000..4517b2437c4
--- /dev/null
+++ b/src/test/modules/test_spi_resowner/sql/ri_fastpath.sql
@@ -0,0 +1,107 @@
+--
+-- Test RI fast-path FK check under C-level SPI.
+--
+-- The RI fast-path caches PK relation references in ri_FastPathGetEntry()
+-- under the current resource owner.  When FK triggers fire inside a
+-- C-level SPI context that creates a dedicated short-lived resource owner,
+-- those references must be released before the inner resource owner is
+-- released.  The fix ensures batch callbacks fire at the same firing depth
+-- at which they were registered, while the corresponding resource owner
+-- is still alive.  Without this, ri_FastPathTeardown would crash with
+-- Assert(rel->rd_refcnt > 0) in index_close.
+--
+-- Simple PL/pgSQL does not trigger this because its SPI connection spans
+-- the entire function call, so its resource owner outlives the batch
+-- callback.  The critical test case requires a C function that creates a
+-- dedicated short-lived resource owner around its SPI call.
+--
+CREATE EXTENSION test_spi_resowner;
+
+CREATE TABLE ri_fp_pk1 (id serial PRIMARY KEY);
+CREATE TABLE ri_fp_pk2 (id serial PRIMARY KEY);
+CREATE TABLE ri_fp_pk3 (id serial PRIMARY KEY);
+INSERT INTO ri_fp_pk1 VALUES (1);
+INSERT INTO ri_fp_pk2 VALUES (1);
+INSERT INTO ri_fp_pk3 VALUES (1);
+
+CREATE TABLE ri_fp_fk (
+    id serial PRIMARY KEY,
+    a int REFERENCES ri_fp_pk1(id),
+    b int REFERENCES ri_fp_pk2(id),
+    c int REFERENCES ri_fp_pk3(id),
+    d int REFERENCES ri_fp_pk1(id),
+    e int REFERENCES ri_fp_pk2(id),
+    f int REFERENCES ri_fp_pk3(id)
+);
+
+-- C-level SPI INSERT: the critical test case.
+-- Without the fix this crashes the server.
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (a, b, c, d, e, f) VALUES (1, 1, 1, 1, 1, 1)');
+
+-- Additional C-level SPI INSERTs to exercise batch reuse across calls.
+-- Use different column orderings to ensure each is a distinct statement.
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (f, e, d, c, b, a) VALUES (1, 1, 1, 1, 1, 1)');
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (a, c, e, b, d, f) VALUES (1, 1, 1, 1, 1, 1)');
+
+-- C-level SPI with FK violation: should error, not crash
+SELECT spi_exec_sql(
+    'INSERT INTO ri_fp_fk (a, b, c, d, e, f) VALUES (999, 1, 1, 1, 1, 1)');
+
+-- Nested: PL/pgSQL calling C SPI (mimics PostGIS toTopoGeom pattern)
+CREATE FUNCTION plpgsql_calls_c_spi() RETURNS void AS $$
+DECLARE
+    ins_stmt text := 'INSERT INTO ri_fp_fk (a, b, c, d, e, f) VALUES (1, 1, 1, 1, 1, 1)';
+BEGIN
+    PERFORM spi_exec_sql(ins_stmt);
+END;
+$$ LANGUAGE plpgsql;
+
+SELECT plpgsql_calls_c_spi();
+
+-- AFTER trigger that uses C-level SPI to insert into an FK-referencing table.
+-- The FK batch callback is registered at the inner SPI's firing depth and
+-- must fire before the inner resource owner is released.  This exercises
+-- the depth-matched callback firing introduced to fix that crash.
+CREATE TABLE ri_fp_outer (id int PRIMARY KEY);
+CREATE TABLE ri_fp_inner (id int REFERENCES ri_fp_pk1(id));
+
+CREATE FUNCTION outer_trigger_spi_ok() RETURNS trigger AS $$
+BEGIN
+    PERFORM spi_exec_sql('INSERT INTO ri_fp_inner VALUES (1)');
+    RETURN NEW;
+END $$ LANGUAGE plpgsql;
+
+CREATE TRIGGER outer_tg AFTER INSERT ON ri_fp_outer
+    FOR EACH ROW EXECUTE FUNCTION outer_trigger_spi_ok();
+
+-- Fires outer_tg, whose PL/pgSQL body calls spi_exec_sql().  The C function
+-- creates a dedicated resource owner; the FK batch callback fires at the
+-- inner SPI's firing depth before that resource owner is released.
+INSERT INTO ri_fp_outer VALUES (1);
+
+CREATE FUNCTION outer_trigger_spi_fail() RETURNS trigger AS $$
+BEGIN
+    PERFORM spi_exec_sql('INSERT INTO ri_fp_inner VALUES (3)');
+    RETURN NEW;
+END $$ LANGUAGE plpgsql;
+
+DROP TRIGGER outer_tg ON ri_fp_outer;
+DROP FUNCTION outer_trigger_spi_ok();
+
+CREATE TRIGGER outer_tg AFTER INSERT ON ri_fp_outer
+    FOR EACH ROW EXECUTE FUNCTION outer_trigger_spi_fail();
+
+--  Like above but the inner insert fails.
+INSERT INTO ri_fp_outer VALUES (2);
+
+DROP TRIGGER outer_tg ON ri_fp_outer;
+DROP FUNCTION outer_trigger_spi_fail();
+DROP TABLE ri_fp_inner, ri_fp_outer;
+
+-- Cleanup
+DROP TABLE ri_fp_fk;
+DROP TABLE ri_fp_pk3, ri_fp_pk2, ri_fp_pk1;
+DROP EXTENSION test_spi_resowner;
diff --git a/src/test/modules/test_spi_resowner/test_spi_resowner--1.0.sql b/src/test/modules/test_spi_resowner/test_spi_resowner--1.0.sql
new file mode 100644
index 00000000000..29ef70ee0dc
--- /dev/null
+++ b/src/test/modules/test_spi_resowner/test_spi_resowner--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_spi_resowner/test_spi_resowner--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_spi_resowner" to load this file. \quit
+
+CREATE FUNCTION spi_exec_sql(query text)
+RETURNS void
+AS 'MODULE_PATHNAME', 'spi_exec_sql'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/test_spi_resowner/test_spi_resowner.c b/src/test/modules/test_spi_resowner/test_spi_resowner.c
new file mode 100644
index 00000000000..0306139b5c0
--- /dev/null
+++ b/src/test/modules/test_spi_resowner/test_spi_resowner.c
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_spi_resowner.c
+ *		SQL-callable C function that uses SPI to execute a query.
+ *
+ *		Useful for testing code paths that only trigger under C-level
+ *		SPI (not PL/pgSQL), such as resource owner interactions with
+ *		RI fast-path FK checks.
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_spi_resowner/test_spi_resowner.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "executor/spi.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(spi_exec_sql);
+
+/*
+ * spi_exec_sql(query text) - execute a SQL query via SPI.
+ *
+ * Opens a fresh SPI connection, executes the query, and closes the
+ * connection.  Creates a dedicated child resource owner around the
+ * SPI_execute call and releases it before returning, ensuring that
+ * any resources registered under it (such as relation references
+ * opened by RI fast-path FK checks) are released before the outer
+ * trigger-firing batch callback fires.  This reproduces the resource
+ * owner mismatch that occurs with C-language extensions like PostGIS
+ * topology functions, which cannot be triggered from PL/pgSQL since
+ * PL/pgSQL's SPI connection spans the entire function call.
+ */
+Datum
+spi_exec_sql(PG_FUNCTION_ARGS)
+{
+	const char *query = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	int			ret;
+	ResourceOwner save = CurrentResourceOwner;
+	ResourceOwner childowner = ResourceOwnerCreate(save, "test_spi inner");
+
+	SPI_connect();
+
+	CurrentResourceOwner = childowner;
+	ret = SPI_execute(query, false, 0);
+
+	if (ret < 0)
+		elog(ERROR, "SPI_execute failed: error code %d", ret);
+
+	SPI_finish();
+
+	CurrentResourceOwner = save;
+	ResourceOwnerRelease(childowner,
+						 RESOURCE_RELEASE_BEFORE_LOCKS,
+						 true, false);
+	ResourceOwnerRelease(childowner,
+						 RESOURCE_RELEASE_LOCKS,
+						 true, false);
+	ResourceOwnerRelease(childowner,
+						 RESOURCE_RELEASE_AFTER_LOCKS,
+						 true, false);
+	ResourceOwnerDelete(childowner);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_spi_resowner/test_spi_resowner.control b/src/test/modules/test_spi_resowner/test_spi_resowner.control
new file mode 100644
index 00000000000..2120ae9442f
--- /dev/null
+++ b/src/test/modules/test_spi_resowner/test_spi_resowner.control
@@ -0,0 +1,4 @@
+comment = 'Test SQL-callable C function that uses SPI using dedicated ResourceOwner'
+default_version = '1.0'
+module_pathname = '$libdir/test_spi_resowner'
+relocatable = true
-- 
2.47.3



  [application/octet-stream] v2-0002-Fix-RI-fast-path-crash-under-nested-C-level-SPI.patch (5.6K, 3-v2-0002-Fix-RI-fast-path-crash-under-nested-C-level-SPI.patch)
  download | inline diff:
From 312fad1c36e064ab9e7dc1780575e8c07f300751 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 8 Apr 2026 18:17:40 +0900
Subject: [PATCH v2 2/2] Fix RI fast-path crash under nested C-level SPI

When a C-language function uses SPI_connect/SPI_execute/SPI_finish to
INSERT into a table with FK constraints, the FK AFTER triggers
register ri_FastPathEndBatch as a batch callback and open PK relations
under the SPI portal's resource owner.  FireAfterTriggerBatchCallbacks
was suppressed at that point by the query_depth > 0 guard, deferring
teardown to the outer query's AfterTriggerEndQuery.  By then the SPI
portal's resource owner had been released, decrementing the cached
relations' refcounts to zero.  ri_FastPathTeardown then crashed
attempting to close them:

  TRAP: failed Assert("rel->rd_refcnt > 0")

Fix by tagging each AfterTriggerCallbackItem with the
afterTriggerFiringDepth (added in 5c54c3ed1b9) at registration time
and firing only callbacks whose depth matches the current depth.  This
replaces the query_depth > 0 suppression guard. Callbacks now fire at
the same firing depth at which they were registered, while the
resource owner that was active during registration is still alive,
eliminating the mismatch.

While at it, ensure callbacks are properly accounted for at all
transaction boundaries, as cleanup of b7b27eb41a5c: assert on commit
that no callbacks remain unfired, and discard any remaining callbacks
on transaction abort.  Also restructure FireAfterTriggerBatchCallbacks()
to update afterTriggers.batch_callbacks before invoking any callbacks,
so that if a callback throws an ERROR the list is already in a
consistent state.

Note that ri_PerformCheck() uses fire_triggers=false, which skips
AfterTriggerBeginQuery/EndQuery and thus never increments
afterTriggerFiringDepth; events queued there fire at the outer
query's depth and are unaffected by this change.

Reported-by: Evan Montgomery-Recht <[email protected]>
Author: Evan Montgomery-Recht <[email protected]>
Co-authored-by: Amit Langote <[email protected]>
Discussion: https://postgr.es/m/CAEg7pwcKf01FmDqFAf-Hzu_pYnMYScY_Otid-pe9uw3BJ6gq9g@mail.gmail.com
---
 src/backend/commands/trigger.c | 54 +++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 11 deletions(-)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index c41005ba44e..f59537fe86e 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3935,6 +3935,8 @@ struct AfterTriggersTableData
 typedef struct AfterTriggerCallbackItem
 {
 	AfterTriggerBatchCallback callback;
+	int			firing_depth;	/* afterTriggerFiringDepth when registered;
+								 * callback fires only at this depth */
 	void	   *arg;
 } AfterTriggerCallbackItem;
 
@@ -5419,6 +5421,15 @@ AfterTriggerEndXact(bool isCommit)
 	afterTriggers.query_depth = -1;
 
 	afterTriggerFiringDepth = 0;
+
+	Assert(afterTriggers.batch_callbacks == NIL || !isCommit);
+
+	/* On abort, discard any pending callbacks without firing them. */
+	if (!isCommit)
+	{
+		list_free_deep(afterTriggers.batch_callbacks);
+		afterTriggers.batch_callbacks = NIL;
+	}
 }
 
 /*
@@ -6830,6 +6841,7 @@ RegisterAfterTriggerBatchCallback(AfterTriggerBatchCallback callback,
 	oldcxt = MemoryContextSwitchTo(TopTransactionContext);
 	item = palloc(sizeof(AfterTriggerCallbackItem));
 	item->callback = callback;
+	item->firing_depth = afterTriggerFiringDepth;
 	item->arg = arg;
 	afterTriggers.batch_callbacks =
 		lappend(afterTriggers.batch_callbacks, item);
@@ -6838,31 +6850,51 @@ RegisterAfterTriggerBatchCallback(AfterTriggerBatchCallback callback,
 
 /*
  * FireAfterTriggerBatchCallbacks
- *		Invoke and clear all registered batch callbacks.
+ *		Invoke callbacks registered at the current firing depth.
+ *
+ * Each callback is tagged with the afterTriggerFiringDepth at registration
+ * time.  Only callbacks matching the current depth are invoked; the rest
+ * are retained for when their own depth fires.  This ensures that nested
+ * trigger-firing contexts (e.g., SPI calls inside AFTER triggers) only
+ * fire the callbacks they registered, leaving outer-level callbacks intact
+ * until their firing depth is reached.
  *
- * Only fires at the outermost query level (query_depth == 0) or from
- * top-level operations (query_depth == -1, e.g. AfterTriggerFireDeferred
- * at COMMIT).  Nested queries from SPI inside AFTER triggers run at
- * depth > 0 and must not tear down resources the outer batch still needs.
+ * The list is updated before any callbacks are invoked so that if a
+ * callback throws an ERROR the list is already in a consistent state.
  */
 static void
 FireAfterTriggerBatchCallbacks(void)
 {
+	List	   *remaining = NIL;
+	List	   *to_fire = NIL;
 	ListCell   *lc;
 
-	if (afterTriggers.query_depth > 0)
-		return;
+	/* remaining and to_fire lists must survive until callbacks complete */
+	MemoryContext oldcxt = MemoryContextSwitchTo(TopTransactionContext);
 
-	Assert(afterTriggerFiringDepth > 0);
 	foreach(lc, afterTriggers.batch_callbacks)
 	{
 		AfterTriggerCallbackItem *item = lfirst(lc);
 
-		item->callback(item->arg);
+		if (item->firing_depth == afterTriggerFiringDepth)
+			to_fire = lappend(to_fire, item);
+		else
+			remaining = lappend(remaining, item);
 	}
 
-	list_free_deep(afterTriggers.batch_callbacks);
-	afterTriggers.batch_callbacks = NIL;
+	list_free(afterTriggers.batch_callbacks);
+	afterTriggers.batch_callbacks = remaining;
+	MemoryContextSwitchTo(oldcxt);
+
+	/* Now fire; if one throws, the list is already clean */
+	foreach(lc, to_fire)
+	{
+		AfterTriggerCallbackItem *item = lfirst(lc);
+
+		item->callback(item->arg);
+		pfree(item);
+	}
+	list_free(to_fire);
 }
 
 /*
-- 
2.47.3



view thread (63+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
  In-Reply-To: <CA+HiwqFdkJGBU9QmYkm6056SuhunGk7yFCuVP=2vBejJa+qhGg@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox