public inbox for [email protected]  
help / color / mirror / Atom feed
From: jie wang <[email protected]>
To: Amit Langote <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Evan Montgomery-Recht <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
Date: Thu, 9 Apr 2026 17:21:51 +0800
Message-ID: <CAJnZyeDyaS=X-eYN=9rDYqK=6ma1gMLa0qDgfNbZKK0e0+q99Q@mail.gmail.com> (raw)
In-Reply-To: <CA+HiwqFWLR01NjK5Y+MKiyaqg64ThVS7UYKK3ZBVNHvtm=3-ug@mail.gmail.com>
References: <CA+HiwqF4C0ws3cO+z5cLkPuvwnAwkSp7sfvgGj3yQ=Li6KNMqA@mail.gmail.com>
	<CA+HiwqGM6nvAV5O+=Nr+BXMPWOma0oeCRVzVP0XiLE8zX5TVAg@mail.gmail.com>
	<CA+HiwqGMaovCUgDbGxVGnK0Mrivr+ph3YE2Ws+47-ugyPb4f7g@mail.gmail.com>
	<CAFj8pRDaiBe_GOLk_yyYHTtPiDAAaLOM8u1-=Q3ZgXBTH+1igg@mail.gmail.com>
	<CA+HiwqGA5Ay_MR0eJEEbt4j6WrVh4F+AasTp8yCbs5aJLOJn6Q@mail.gmail.com>
	<CAEG8a3JM=NoqiTK0V6S9FNxZPvy1+C5F7rfafTtPKBVWnunL-g@mail.gmail.com>
	<CA+HiwqEyiLCY6MTLbOJXDdLNNQLaURYHvdW797MQgbjEK9od4Q@mail.gmail.com>
	<CAEG8a3+VBpwPf1Rm-ECD90whM9b3YnGhux5CVXdsL6khiBfzRQ@mail.gmail.com>
	<CA+HiwqF2UHzF0sKCp-F2a-U29rqh_9ZPy=f1h+Fh_=M8efj3pg@mail.gmail.com>
	<CAEG8a3L9Ew-WL8sxLROVOcypeaENPmd8qCmMvz4geoGL1TDGCA@mail.gmail.com>
	<CAEG8a3+nUFQo4sdPQF9xy0J73J8RFJ5U9A5+_kMosGDaZ+1sXA@mail.gmail.com>
	<[email protected]>
	<CAEG8a3JyKdizWvYsF+z_mA1BKy=dpW11iKVMOG=bk6Tbz6M1Bw@mail.gmail.com>
	<CAEG8a3+Hf4tvvbts29_k_AFhWQmRYfEo_SW4C5FY_140iKghBw@mail.gmail.com>
	<CA+HiwqFV-PY-3BxM6j5TaAiC3AwedDxo-6vwRSbvygg3zF+xAQ@mail.gmail.com>
	<CA+HiwqHpaisS-e+0gAgzh92qZAFxncAJMmmTRZZN=efoeTPgOg@mail.gmail.com>
	<CA+HiwqFwZ6WLRbY8R7VC7JVp5Jot6ktZOkr9wDxTjoK=W1SgdQ@mail.gmail.com>
	<CA+HiwqF_Kz=R8juHJBiOATvabWSOugK4VaGOxoJ_n=E2c7UM9w@mail.gmail.com>
	<CA+HiwqHCB7kcbspkhaLN9enoj5x=ehzhFM4PXDgWUUP8Px41GA@mail.gmail.com>
	<[email protected]>
	<CA+HiwqHpVtP485wEKuXdOkdoZWhvVvfFH40-og07Jp3MPx21eg@mail.gmail.com>
	<CAEG8a3JWHkJSXe9nNcVK7wnYKUEqWuMGFDhy5BynB_9tEjmEUg@mail.gmail.com>
	<CA+HiwqFjfumKrWy03q5M309xJJVYt0WgGfH6AZ8BjFhSwppwsQ@mail.gmail.com>
	<CAEG8a3JjP1LaKSv-r3AMJLRyLMzENJrKshWsDvDouMPM_sizmA@mail.gmail.com>
	<CA+HiwqFQ+ZA7hSOygv4uv_t75B3r0_gosjadetCsAEoaZwTu6g@mail.gmail.com>
	<CA+HiwqHdB0r8z6Mut13BxpYNq2W-os+Arju4mcZbCyU9PeaVog@mail.gmail.com>
	<CAEG8a3K5ayVNkCDnK9OtNb+4OY0chJtW6ObgEOSFjqyymQda8Q@mail.gmail.com>
	<CA+HiwqGJYCgEs_F-LBtrRdx-Y969LMr-OVogjFXU6U-0q5bOwQ@mail.gmail.com>
	<CA+HiwqF2Ma6R_QyjfPBvFreYbezFpGcwASJE0a2bM+Y0jvof+g@mail.gmail.com>
	<[email protected]>
	<CA+HiwqGFRAH2O5bGpNMErFopFyn-2-Zu3+5y+BFQim9TE8z+Pw@mail.gmail.com>
	<[email protected]>
	<CA+HiwqFx=aciJYkkaviyTutUm303QXz6GtXSqzG7nfd4MAzddQ@mail.gmail.com>
	<CA+HiwqH39QU7vGVx65JH1e0nzVvQc5eVmuY7=qyj0T_+b-HO3A@mail.gmail.com>
	<[email protected]>
	<CA+HiwqF+jAyHUiNzpR+vRBbpeCiVAdFU52-ffTGko9Zit317oA@mail.gmail.com>
	<CAEg7pwcKf01FmDqFAf-Hzu_pYnMYScY_Otid-pe9uw3BJ6gq9g@mail.gmail.com>
	<CA+HiwqFK8rXTNknxV0MMe++7W08g3kY6eeLK21A1xCrK2Wuk8Q@mail.gmail.com>
	<CA+HiwqFdkJGBU9QmYkm6056SuhunGk7yFCuVP=2vBejJa+qhGg@mail.gmail.com>
	<CA+HiwqFt4NGTNk7BinOsHHM48E9zGAa852vCfGoSe1bbL=JNFQ@mail.gmail.com>
	<[email protected]>
	<CA+HiwqFWLR01NjK5Y+MKiyaqg64ThVS7UYKK3ZBVNHvtm=3-ug@mail.gmail.com>

Amit Langote <[email protected]> 于2026年4月9日周四 16:41写道:

> Hi,
>
> On Thu, Apr 9, 2026 at 4:40 PM Chao Li <[email protected]> wrote:
> > > On Apr 8, 2026, at 22:26, Amit Langote <[email protected]>
> wrote:
> > > On Wed, Apr 8, 2026 at 6:58 PM Amit Langote <[email protected]>
> wrote:
> > >> On Wed, Apr 8, 2026 at 10:23 AM Amit Langote <[email protected]>
> wrote:
> > >>> On Tue, Apr 7, 2026 at 10:00 PM Evan Montgomery-Recht
> > >>> <[email protected]> wrote:
> > >>>> The patch also adds a test module (test_spi_func) with a C function
> > >>>> that executes SQL via SPI_connect/SPI_execute/SPI_finish, since this
> > >>>> crash cannot be triggered from PL/pgSQL. The test exercises the
> > >>>> C-level SPI INSERT with multiple FK constraints, FK violations, and
> > >>>> nested PL/pgSQL-calls-C-SPI (matching the PostGIS call pattern).
> > >>
> > >> I applied only the test module changes and it passes (without
> > >> crashing) even without your proposed fix. It seems that's because the
> > >> C function in test_spi_func calling SPI is using the same resource
> > >> owner as the parent SELECT. I think you'd need to create a resource
> > >> owner manually in the spi_exec() C function to reproduce the crash, as
> > >> done in the attached 0001, which contains the src/test changes
> > >> extracted from your patch modified as described, including renaming
> > >> the C function to spi_exec_sql().
> > >>
> > >> Also, the test cases that call spi_exec() (_sql()) directly from a
> > >> SELECT don't actually exercise the crash path because there is no
> > >> outer trigger-firing loop active. query_depth is 0 inside the inner
> > >> SPI's AfterTriggerEndQuery, so the old guard wouldn't suppress the
> > >> callback there anyway. The critical case requires spi_exec_sql() to be
> > >> called from inside an AFTER trigger, where query_depth > 0 causes the
> > >> guard to defer the callback past the inner resource owner's lifetime.
> > >> I've added that test case. I kept your original test cases as they
> > >> still provide useful coverage of C-level SPI FK behavior even if they
> > >> don't exercise the crash path specifically.  Maybe your original
> > >> PostGIS test suite that hit the crash did have the right structure,
> > >> but that's not reflected in the patch as far as I can tell.
> > >>
> > >> I've also renamed the module to test_spi_resowner to better reflect
> > >> what it's about.
> > >>
> > >> For the fix, I have a different proposal. As you observed, the
> > >> query_depth > 0 early return in FireAfterTriggerBatchCallbacks() means
> > >> that the nested SPI's callbacks get called under the outer resource
> > >> owner, which may not be the same as the one that SPI used. I think it
> > >> was a mistake to have that early return in the first place. Instead we
> > >> could remember for each callback what firing level it should be called
> > >> at, so the nested SPI's callbacks fire before returning to the parent
> > >> level and parent-level callbacks fire when the parent level completes.
> > >> I have implemented that in the attached 0002 along with transaction
> > >> boundary cleanup of callbacks, which passes the check-world for me,
> > >> but I'll need to stare some more at it before committing.
> > >>
> > >> Let me know if this also fixes your own in-house test suite or if you
> > >> have any other suggestions or if you think I am missing something.
> > >
> > > One more cleanup patch attached as 0003: afterTriggerFiringDepth was
> > > added by commit 5c54c3ed1 as a file-static variable, which in
> > > hindsight should have been a field in AfterTriggersData alongside the
> > > other per-transaction after-trigger state. This patch makes that
> > > correction.
> > >
> > > One alternative design worth considering for 0002: storing
> > > batch_callbacks per query level in AfterTriggersQueryData rather than
> > > as a single list in AfterTriggersData, so callbacks naturally live at
> > > the query level where they were registered and get cleaned up with
> > > AfterTriggerFreeQuery on abort. Deferred constraints still need a
> > > top-level list in AfterTriggersData since they fire outside any query
> > > level. FireAfterTriggerBatchCallbacks() takes a list parameter and the
> > > caller passes either the query-level or top-level list as appropriate.
> > > This eliminates the need for firing_depth-matched firing entirely. I
> > > did that in 0004.  I think I like it over 0002.  Will look more
> > > closely tomorrow morning.
> > A few comments on v3:
>
> Thanks for the review.
>
> > 1 - 0002
> > ```
> >  static void
> >  FireAfterTriggerBatchCallbacks(void)
> >  {
> > +       List       *remaining = NIL;
> > +       List       *to_fire = NIL;
> >         ListCell   *lc;
> >
> > -       if (afterTriggers.query_depth > 0)
> > -               return;
> > +       /* remaining and to_fire lists must survive until callbacks
> complete */
> > +       MemoryContext oldcxt =
> MemoryContextSwitchTo(TopTransactionContext);
> > ```
> >
> > I think remaining and to_fire should stay in the same context of
> afterTriggers.batch_callbacks, so instead of hard coding
> TopTransactionContext, we can use
> GetMemoryChunkContext(afterTriggers.batch_callbacks), which makes the
> intention explicit.
>
> I'm dropping 0002 or have merged 0004 into it so this memory context
> switch is no longer present.
>
> > 2 - 0004, I noticed one potential problem, although I am not sure
> whether it can really happen in practice. This version stores callback
> items at the individual query depth, and FireAfterTriggerBatchCallbacks()
> now iterates the callback list for that depth and invokes each callback
> directly. My concern is that if one of those callbacks needs to register a
> new callback, that would append a new item to the same list while it is
> being iterated. That seems unsafe to me, because list append may create a
> new list structure underneath. If that happens, we may end up modifying the
> list being traversed, which does not look safe.
> >
> > This problem doesn’t exist in 0002, because 0002 splits
> afterTriggers.batch_callbacks into remaining and to_fire, and reset
> afterTriggers.batch_callbacks = remaining before running callbacks. But the
> problem is, if a callback registers a new callback, the new callback goes
> to afterTriggers.batch_callbacks, so it won’t get executed.
> >
> > From this perspective, I would assume a callback should not be allowed
> to register a new callback. Can you please help confirm?
>
> Good point on the re-entrant registration concern. I've added a
> firing_batch_callbacks flag to AfterTriggersData that prevents
> callbacks from registering new callbacks during
> FireAfterTriggerBatchCallbacks(), with an Assert in
> RegisterAfterTriggerBatchCallback() to enforce it. That should keep
> the list being iterated from being modified.
>
> The attached patches are updated accordingly. 0001 is the main fix
> incorporating the per-query-level storage design, the transaction
> boundary cleanup, and the firing_batch_callbacks guard. 0002 is a
> followup that moves afterTriggerFiringDepth into AfterTriggersData as
> a minor cleanup of 5c54c3ed1b9. Barring further feedback I plan to
> commit 0001 and 0002 shortly. For 0003, I need to check on the policy
> around adding new test modules during feature freeze before committing
> it.
>
> --
> Thanks, Amit Langote
>


 Hi,

I took a glance at the patch, overall looks good to me. A nitpick on 0001:

+       bool            firing_batch_callbacks; /* true when in
+
     * FireAfterTriggersBatchCallbacks() */

Looks like a typo in the comment. The function name is
FireAfterTriggerBatchCallbacks, no “s” after Trigger.

Best regards,
--
wang jie


view thread (63+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
  In-Reply-To: <CAJnZyeDyaS=X-eYN=9rDYqK=6ma1gMLa0qDgfNbZKK0e0+q99Q@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox