Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXE71-0038XS-2E for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jun 2026 08:16:36 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wXE70-00A1Ai-0j for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jun 2026 08:16:34 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXE6z-00A1AY-2m for pgsql-hackers@lists.postgresql.org; Wed, 10 Jun 2026 08:16:33 +0000 Received: from mail-dl1-x1234.google.com ([2607:f8b0:4864:20::1234]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wXE6x-00000002IU3-24wI for pgsql-hackers@postgresql.org; Wed, 10 Jun 2026 08:16:33 +0000 Received: by mail-dl1-x1234.google.com with SMTP id a92af1059eb24-137ec563a95so8106209c88.0 for ; Wed, 10 Jun 2026 01:16:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1781079388; cv=none; d=google.com; s=arc-20240605; b=LYJJuKOFG75s9bqnIjqEKMFG4h7eZXV6wvf9Mj8N1ihuxCSnjUftd4eowxWhSiAYBr 0019sYai+jfoqEUkAIouPHVicIlg3v9ISffh3sj6XE6qsdNGInb/uXCf2lq4DMPcfl03 hwuQLV5MSCM1Maem0N2A8+q7VEnof7wT4CfOKLckK+XY7DkAr7+TvByJSwX9DVl+fVht BYDLdijhACxQmMvFBRh0dPsfpwAVN4L76yjTU1Skjft228WHrSJKiThzukyONnztmV5h CF+utTX0Hs5sqg74axVyYLTGgzxibKiqRUiOODqDMNRPvyIb/Kqab/dr52PJLMuNQ+VQ 2Nug== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=o8sQl1wn/t7G8fkto4KmwN0JcqI0v+jeCWfkctmMifU=; fh=/mJFQ6aK3JUBPrErf5+p9kiotrxvf9NN9n1hzoU3viI=; b=D/Fcphu/gTkRjoQD5XczNbltpzLuf8kUabQANIAIQqiXr98CROxKPBSDa+56z2ockR l53ESIQEh6oqyFjCc5h7ufTEVgg8rEy//VRgMeFiyF8yuXGzVithK7t0Zl5tQZKjGrAd o8L+vqL7AzAmg6m0QaWpKOcW6TyfsAllfQay5Hnt/HZuOSzfcU5XlsUt/zjq01LWF7Av FRiktshaV/Rix3wswKec1pYq9JciAKf1LFKkkmxy2FAlFZUOneppNQGzwElfbrH7FEOY ORRhI8fb+6ru9l40z/nW1ytG9NQh5DUJ15nYqeKcwS2dNElpYSZ1Oj0LUGOuC0hmvpwE I3YA==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=postgres.ai; s=google; t=1781079388; x=1781684188; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=o8sQl1wn/t7G8fkto4KmwN0JcqI0v+jeCWfkctmMifU=; b=D7PfTJRq7Z1nn7E2bPJvXKOxH1PE0fhr6lxujchjB3nY5Tncl0s5zenOMqvFZLMOce RhSHgtSbViORdLITE6wFo4ljvHGqS+SvoKORhCRUwLiqD00wwO3CaBQSktjTzTGbryVM nsju6kMV6i5dG1y20Fyl5mFpcGBjP0kiwqt9fju9pmdE7FpeNkABLzd7sW9RVy7Chj4m vs2jUIm2KDgha0qIpjfVqmOnFe+BdITAK0gSp4W0OTjH1qV1z/vRu3Dmi5v0znMAMtKB C+IDq9CVPqXQiR69/bG6YfEQTQgMVUl7hU1Pfj5tfgTtGO8S66lJSaNr6cg6fOdU02I7 jxsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1781079388; x=1781684188; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=o8sQl1wn/t7G8fkto4KmwN0JcqI0v+jeCWfkctmMifU=; b=FvlidU6G9p72UcDp9mvx6BT6KPr2BfMeCfdcQl+LCsz71k/eTP5r+S6HjTUJ/igj52 qhsBtF/QSimQotMAFmOtyZQbRTpnFFSauigshM0bJpmQPgyewGI8SqOogg2/1/k4viET /64NaSowObrZ6ing+B18LvuBZcIV0bmXQ1DaPyt+XImXFKyBl7u9MPATLnyuLQ/OqWgQ 9OgJrgAXVggqJS+Qz/tRoS0sf2x/XoX5vzQNZkLXaapTguldcMb3xnYu/N+5UtW7mCCp UoHocfoAjJ/ytZkXAYy6htpAxLWCL12qX8tbH/xze0VRk9VDyZziRvZPYcSvTYzAOhI9 zeVg== X-Gm-Message-State: AOJu0YwJjLDP03UQUb7AQSytQzhcj6szWcXYdxvNXbiWWFUPPwZG2kyM p1FKC7glyLVbuJUhItGufwwrbUKbDGI59L6PzkAu2n4pLzC9tAMA2/hgaNNxtGLHyVta8qiS20q Dv3xB1E2nEKB2zIOqB3z8fyyLBv8yDPx5hY46lIhsBA== X-Gm-Gg: Acq92OFGWBHTOIGFuyxvRLtS5UHPxQ199sAgpRknoGqWQVxeIUadI7dFR8UvR7p0+We f16KBg2Q9aQODJrUjkq4Mfss33qK09JX83y6tLda/Vb98c5gkkcUrBGuVl07jzGQowvyydJR/d9 roKF/Jy/E/6bHmG643ZuhE+5GsDCDDYRijIJrs4tW0zh0BzQLS9aeIb1OxREody4kl4E1E4NNyo O+WtwxQfrDAR5XrMzKjrtxDfPoV5LcvLo8XHdTHWO/E+4bwTECMdyFQhXEbqqhZdm8jRru7uERq seItXCJACw4JI8f96A== X-Received: by 2002:a05:7023:b81:b0:135:f5ed:868 with SMTP id a92af1059eb24-13818f937a5mr7581089c88.17.1781079388237; Wed, 10 Jun 2026 01:16:28 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Nikolay Samokhvalov Date: Wed, 10 Jun 2026 01:16:16 -0700 X-Gm-Features: AVVi8Cd9ZT5zilO691i2WtWoeR5YoMpPbsajpDqYuJwZI5cI2bPR9YaykrJKvCM Message-ID: Subject: Re: PG19 FK fast path: OOB write and missed FK checks during batched To: Amit Langote Cc: pgsql-hackers mailing list , Andrey Borodin , Kirk Wolak Content-Type: multipart/alternative; boundary="00000000000040c6190653e1dc64" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000040c6190653e1dc64 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jun 9, 2026 at 6:31=E2=80=AFAM Amit Langote wrote: > On Mon, Jun 8, 2026 at 5:18=E2=80=AFPM Amit Langote > wrote: > > On Sat, Jun 6, 2026 at 6:13=E2=80=AFPM Amit Langote > wrote: > > > Thanks for the detailed report and reproducers. I=E2=80=99ve started = looking > into this. > > > > Continuing to look. Appended this to the open items list: > > > > https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items#Open_Issues > > Thanks again, Nik, for the thorough analysis and the reproducers -- > they made all three easy to confirm and pin down. Patches attached: > 0001 for defect 1, 0002 for defects 2 and 3. > > 0001 (defect 1): check and flush before writing the row rather than > after, and add a per-entry "flushing" flag so a re-entrant add on the > same entry during a flush takes the per-row path instead of touching > the mid-flush batch. The flag is cleared in a PG_FINALLY, which also > resets batch_count, so the entry stays reusable if a flush error is > caught by a savepoint. > > 0002 (defects 2 and 3): rather than track subxact membership per row, > confine batching to the top transaction level -- in RI_FKey_check, > when GetCurrentTransactionNestLevel() > 1, use the per-row path. I > went this way because per-entry subxact tracking isn't enough (one > entry's batch can mix rows from several levels, since the cache is > keyed by constraint), and flushing at subxact boundaries doesn't work > for deferred constraints. Once the cache only ever holds top-level > rows, a subxact abort has nothing of its own to discard, so > ri_FastPathSubXactCallback goes away -- that's what fixes your defect > 2 reproducer. For defect 3, which is still reachable at the top level, > the same patch adds a cache-wide flag set while ri_FastPathEndBatch > iterates, so a re-entrant check during the scan takes the per-row path > instead of inserting into the cache being scanned. > > The per-row path still bypasses SPI, so these stay well ahead of the > pre-19 check in terms of performance. I'd like to recover batching > across subtransactions properly in v20 but didn't want to rush it now. > > On defect 3, can you check whether your reproducer still commits the > orphan with 0002 applied, or whether (like on my build) it now raises > the violation? I'd like to be sure the bucket-placement variation you > hit is actually covered. And of course any review of the patches is > welcome. > > -- > Thanks, Amit Langote > Hi Amit, Thanks for the quick fixes. I checked v1-0001 + v1-0002 against current master (e18b0cb7) with an assertion/debug build. - Both apply cleanly to master (in sequence) - Defect 1 same-FK re-entry no longer crashes; the original shape completes and leaves the expected rows - Defect 2 subtransaction-abort case now raises the FK violation instead of committing orphans - For your defect 3 question: with 0002 applied, my reproducer no longer commits the child2 orphan. It raises: ERROR: insert or update on table "child2" violates foreign key constraint "child2_fkey" DETAIL: Key (a)=3D(999999) is not present in table "parent". After the error, child2_orphans =3D 0 and child2 is empty in my run. I also ran the regression suite in that tree; foreign_key passed, and the full run reported all 245 tests passed. So v1 looks good to me for the three reported cases. Thanks! Nik --00000000000040c6190653e1dc64 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Tue, Jun 9, 2026 at 6:31=E2=80=AF= AM Amit Langote <amitlangote0= 9@gmail.com> wrote:
On Mon, Jun 8= , 2026 at 5:18=E2=80=AFPM Amit Langote <amitlangote09@gmail.com> wrote:
> On Sat, Jun 6, 2026 at 6:13=E2=80=AFPM Amit Langote <amitlangote09@gmail.com&= gt; wrote:
> > Thanks for the detailed report and reproducers. I=E2=80=99ve star= ted looking into this.
>
> Continuing to look.=C2=A0 Appended this to the open items list:
>
> https://wiki.postgresql.or= g/wiki/PostgreSQL_19_Open_Items#Open_Issues

Thanks again, Nik, for the thorough analysis and the reproducers --
they made all three easy to confirm and pin down. Patches attached:
0001 for defect 1, 0002 for defects 2 and 3.

0001 (defect 1): check and flush before writing the row rather than
after, and add a per-entry "flushing" flag so a re-entrant add on= the
same entry during a flush takes the per-row path instead of touching
the mid-flush batch. The flag is cleared in a PG_FINALLY, which also
resets batch_count, so the entry stays reusable if a flush error is
caught by a savepoint.

0002 (defects 2 and 3): rather than track subxact membership per row,
confine batching to the top transaction level -- in RI_FKey_check,
when GetCurrentTransactionNestLevel() > 1, use the per-row path. I
went this way because per-entry subxact tracking isn't enough (one
entry's batch can mix rows from several levels, since the cache is
keyed by constraint), and flushing at subxact boundaries doesn't work for deferred constraints. Once the cache only ever holds top-level
rows, a subxact abort has nothing of its own to discard, so
ri_FastPathSubXactCallback goes away -- that's what fixes your defect 2 reproducer. For defect 3, which is still reachable at the top level,
the same patch adds a cache-wide flag set while ri_FastPathEndBatch
iterates, so a re-entrant check during the scan takes the per-row path
instead of inserting into the cache being scanned.

The per-row path still bypasses SPI, so these stay well ahead of the
pre-19 check in terms of performance. I'd like to recover batching
across subtransactions properly in v20 but didn't want to rush it now.<= br>
On defect 3, can you check whether your reproducer still commits the
orphan with 0002 applied, or whether (like on my build) it now raises
the violation? I'd like to be sure the bucket-placement variation you hit is actually covered. And of course any review of the patches is
welcome.

--
Thanks, Amit Langote

Hi Amit,

Thanks = for the quick fixes.
=
I checked v1-000= 1 + v1-0002 against current master (e18b0cb7) with an assertion/debug build= .

- Both apply cleanly to master (in = sequence)
- Defect 1 same-FK re-entry no longer crashes; the original sh= ape completes and leaves the expected rows
- Defect 2 subtransaction-abo= rt case now raises the FK violation instead of committing orphans
- For = your defect 3 question: with 0002 applied, my reproducer no longer commits = the child2 orphan. It raises:
=C2=A0 =C2=A0 ERROR: insert or update on t= able "child2" violates foreign key constraint "child2_fkey&q= uot;
=C2=A0 =C2=A0 DETAIL: Key (a)=3D(999999) is not present in table &q= uot;parent".

After the error, child2_orphans =3D 0 and child2 i= s empty in my run.

I also ran the regression suite in that tree; for= eign_key passed, and the full run reported all 245 tests passed.

So = v1 looks good to me for the three reported cases.

Thanks!

Nik
--00000000000040c6190653e1dc64--