From: Amit Langote
Date: Wed, 24 Dec 2025 13:38:33 +0530
Subject: Re: BUG #19355: Attempt to insert data unexpectedly during concurrent update
To: Dean Rasheed
Cc: Bh W, Laurenz Albe, pgsql-bugs@lists.postgresql.org
References: <19355-57d7d52ea4980dc6@postgresql.org> <868ff2a518820c8864b6d28510294b2457a126af.camel@cybertec.at>

Hi,

On Tue, Dec 23, 2025 at 4:07 Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On Mon, 22 Dec 2025 at 14:51, Bh W <wangbihua.cn@gmail.com> wrote:
> >
> > The issue is that the MERGE INTO match condition is not updated.
> > In the MATCHED path of MERGE INTO, when the target row satisfies the match condition and the condition itself has not changed, the system should still be able to handle concurrent updates to the same target row by relying on EvalPlanQual (EPQ) to refetch the latest version of the tuple, and then proceed with the intended update.
> > However, in the current implementation, even though the concurrent update does not modify any columns relevant to the ON condition, the EPQ recheck unexpectedly results in a match condition failure, causing the update path that should remain MATCHED to be treated as NOT MATCHED.
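
To make the symptom described above concrete, here is a minimal toy model in C of why a failed recheck turns into a primary key violation. It is only a sketch under stated assumptions: every name in it (ToyRow, ToyResult, toy_find, toy_merge_after_recheck) is hypothetical and does not exist in PostgreSQL, and the recheck outcome is passed in as a plain flag rather than computed the way EvalPlanQual() does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct ToyRow {
    int a;  /* plays the role of the primary key column t1.a */
    int b;
} ToyRow;

typedef enum ToyResult {
    TOY_UPDATED,       /* the WHEN MATCHED action ran */
    TOY_INSERTED,      /* the WHEN NOT MATCHED action ran */
    TOY_PK_VIOLATION   /* the insert collided with an existing key */
} ToyResult;

/* Return the row whose key equals a, or NULL if there is none. */
ToyRow *toy_find(ToyRow *rows, size_t n, int a)
{
    for (size_t i = 0; i < n; i++)
        if (rows[i].a == a)
            return &rows[i];
    return NULL;
}

/*
 * Handle one source row whose target row was concurrently updated.
 * recheck_matches is an assumed input standing for "the recheck on the
 * latest row version still satisfies the ON condition".
 */
ToyResult toy_merge_after_recheck(ToyRow *rows, size_t *n, size_t cap,
                                  int target_key, bool recheck_matches,
                                  ToyRow to_insert)
{
    if (recheck_matches) {
        ToyRow *row = toy_find(rows, *n, target_key);
        assert(row != NULL);
        row->b += 1;                 /* update set b = t1.b + 1 */
        return TOY_UPDATED;
    }

    /* Treated as NOT MATCHED: the insert action is attempted instead. */
    if (toy_find(rows, *n, to_insert.a) != NULL)
        return TOY_PK_VIOLATION;     /* the key is already in the table */

    assert(*n < cap);
    rows[(*n)++] = to_insert;
    return TOY_INSERTED;
}
```

With the table holding keys 1 and 2, calling toy_merge_after_recheck() with recheck_matches set to false and an insert row of (1,1) reports TOY_PK_VIOLATION, which mirrors the reported failure; with recheck_matches set to true the row is updated instead.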

> I spent a little time looking at this, and managed to reduce the
> reproducer test case down to this:
>
> -- Setup
> drop table if exists t1,t2;
> create table t1(a int primary key, b int);
> create table t2(a int, b int);
>
> insert into t1 values(1,0),(2,0);
> insert into t2 values(1,1),(2,2);
>
> -- Session 1
> begin;
> update t1 set b = b+1;
>
> -- Session 2
> merge into t1 using (values(1,1),(2,2)) as t3(a,b) on (t1.a = t3.a)
> when matched then
>       update set b = t1.b + 1
> when not matched then
>       insert (a,b) values (1,1);
>
> -- Session 1
> commit;
>
> This works fine in PG17, but fails with a PK violation in PG18.
> Git-bisecting points to this commit:

> cbc127917e04a978a788b8bc9d35a70244396d5b is the first bad commit
> commit cbc127917e04a978a788b8bc9d35a70244396d5b
> Author: Amit Langote <amitlan@postgresql.org>
> Date:   Fri Feb 7 17:15:09 2025 +0900
>
>     Track unpruned relids to avoid processing pruned relations
>
> Doing a little more debugging, it looks like the problem might be this
> change in InitPlan():
>
> -           /* ignore "parent" rowmarks; they are irrelevant at runtime */
> -           if (rc->isParent)
> +           /*
> +            * Ignore "parent" rowmarks, because they are irrelevant at
> +            * runtime.  Also ignore the rowmarks belonging to child tables
> +            * that have been pruned in ExecDoInitialPruning().
> +            */
> +           if (rc->isParent ||
> +               !bms_is_member(rc->rti, estate->es_unpruned_relids))
>                 continue;
>
> which seems to cause it to incorrectly skip a rowmark, which I suspect
> is what is causing EvalPlanQual() to return the wrong result.
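
The effect of the quoted condition can be sketched with a small standalone C model. This is an illustration under loud assumptions, not the executor code: ToyRowMark, ToyRelids and the toy_* functions are hypothetical stand-ins, and a plain bitmask replaces PostgreSQL's Bitmapset. The only thing it shows is that a non-parent rowmark is dropped whenever its rti is missing from the set, whatever the reason for its absence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are not the PostgreSQL structs or bitmapset API. */
typedef struct ToyRowMark {
    unsigned rti;       /* range table index the rowmark refers to */
    bool     isParent;  /* "parent" rowmark, irrelevant at runtime */
} ToyRowMark;

typedef unsigned ToyRelids;  /* bit N set means relid N is a member */

/* Return set with rti added (toy limit: rti must be below 32). */
ToyRelids toy_add_member(ToyRelids set, unsigned rti)
{
    assert(rti < 32);
    return set | (1u << rti);
}

/* True if rti is a member of set. */
bool toy_is_member(unsigned rti, ToyRelids set)
{
    return rti < 32 && (set & (1u << rti)) != 0;
}

/* Same shape as the quoted condition: skip parents and absent relids. */
bool toy_skip_rowmark(const ToyRowMark *rc, ToyRelids unpruned_relids)
{
    return rc->isParent || !toy_is_member(rc->rti, unpruned_relids);
}

/* Count the rowmarks that survive the filter. */
size_t toy_count_kept(const ToyRowMark *marks, size_t n,
                      ToyRelids unpruned_relids)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
        if (!toy_skip_rowmark(&marks[i], unpruned_relids))
            kept++;
    return kept;
}
```

For example, with non-parent rowmarks on rti 1 and rti 2, a set containing both keeps both rowmarks, while a set that only contains rti 1 silently drops the rowmark for rti 2 even though nothing was pruned in this toy setup.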

Thanks for the detailed analysis and adding me to the thread, Dean.

I would think that a case that involves no partitioning at all would be untouchable by this code, but it looks like the logic I added is incorrectly affecting cases where pruning isn’t even relevant. I’ll need to look more carefully at why such a rowmark would exist in the rowmarks list if its relation isn’t in es_unpruned_relids. Maybe the set population is incorrect at some point, or perhaps it matters that the set is a copy in the EPQ estate.

I’m afk (on vacation) at the moment, so won’t be able to dig into this until next week.

-- Amit