Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wRwuV-002pfb-0B for pgsql-hackers@arkaria.postgresql.org; Tue, 26 May 2026 18:53:51 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wRwuS-005ezc-1r for pgsql-hackers@arkaria.postgresql.org; Tue, 26 May 2026 18:53:49 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wRwuS-005ezU-0Z for pgsql-hackers@lists.postgresql.org; Tue, 26 May 2026 18:53:49 +0000 Received: from mail-qv1-xf2f.google.com ([2607:f8b0:4864:20::f2f]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wRwuQ-00000001ZDk-3dMW for pgsql-hackers@lists.postgresql.org; Tue, 26 May 2026 18:53:48 +0000 Received: by mail-qv1-xf2f.google.com with SMTP id 6a1803df08f44-8b446bba9c7so9641076d6.2 for ; Tue, 26 May 2026 11:53:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1779821625; cv=none; d=google.com; s=arc-20240605; b=bkRvcPOqDUsDdGfjlycEd4cgcH9OPrvIbG6QAURyuUp7+95Ao6i2bM/yZfDDdXC2Vu hIOd5qUirESZ68JfgVr22GjMiRK+i3rPxY/12lmzjbkHG2qbM82NTrBUAJEP06o/awp0 1j81TeARgVuJHNHZskxAUt0KAonf92XGIdonhN0BsczE9klNuf4kuNaOc/hiHbq95LzE 9v/PRGihwzC0r4pN8sj8Oyot/UScJdk30ImDb7h4TlJ0c2bucnHdOlctFnP9ZVz8vsjs OOgMZUWAE0EN0Yf8X+O4jZ9n53M8ZTfHegKR21XoRofqRN+gIq/2UQqb7Hy2/ysqGg/2 PHsQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=6174d9goaqDp5QPONGRvl5bIjs0s/E+2KJ2TejYMLNs=; fh=PaVvtMb47A3qJKwFjOD7g653J9sTkruYTS6lN+JKhWo=; b=RxqTyTsvGSSWozKvbiUb/BHbmMekI0cSy6B/jcUBiCO4PszjUwgsI4ybowu/WCwuoK q8HrH8X1DCYqVBYNZhweVPksMy+zXlbE/userACL0TX2gY2U86afcpUVrX9Wyr5w9hJH oLZZ6T5wdNEbGoM3JjzAaZQHruigHjiBArdGRgciC2ol8rvMTfCQWFF6/aT0HZa2cfSJ akqcF6ARzwAH4cx8nLSvA+1xg2G7/MWllaBugueD5653QQ80OrFgDvWTlGw1tXdeknMZ xDhAEMaQkUj7CcCwyjPirGD8cEZ8tZOHazs5S2J+p/rxHpWSqwi3EEjHDdGs/zOXW6bv 2p5w==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779821625; x=1780426425; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=6174d9goaqDp5QPONGRvl5bIjs0s/E+2KJ2TejYMLNs=; b=F1O91IKQrbsIlUsfLNMXNX1xkXzwrVtmeXW1Lk39irwmAHVpOWsho8iewH89BgfdZE /dPa12aOLGBYCr191GCKc1YY5uTKfeyQ8xVWnfWX/JNed4U3wCq+hGrM2QEfy/pnIpgs rtEql8Fj2w7NoGexM4l0dEUj46CKBQEECtP8EjheYP2N/NUvZUpZcWiMp0qMka041tqb vtBnbTz1tiIpmZWzug9veWKYzG3qoog4wDYM4DhxUT8Fzucn/07olK+CLUX/KDkPEJCT p5smLsHH1Euc+bYwJAPxGLOKFqNg9fQURZHenODjgBgvi+ahDDeFOSj6dhFbrrp7ozBw 7IRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779821625; x=1780426425; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=6174d9goaqDp5QPONGRvl5bIjs0s/E+2KJ2TejYMLNs=; b=iRbOf8hnHSOcc9KNAHt9HPuwCYD0M38Lt0UBqeVUyEb3CJ9DADvBcEj6aQlCV8/gOk HZdXCMemSad+moztA35yM06AXzgt+9izhlSLl5hDr6iq5w3sCuEceTO9GWP9yg0Jnsfh T7h7+7sqp9YQXq+Flctg0Xn2+ex7JthzJtLibnmjDeyzvvkYQYZyzkVoTr3s3B4gHzi8 uMV6UWjM+Elw+avAbD7YDVJfbotKFIIhRXLu4p7P6WFBctwP8LDdZuKmI6x/i04wjibH C9SPdpUGnEpzyQC2gJxBW4uQNlUVHFjyiaUSw0qxB4U5ZLlKDWPUFcbsBsm1YT3iKxsC lamQ== X-Gm-Message-State: AOJu0Yw5e7rpH2Ns131BEyx909kmgQY3zSN/T+Taviyj9BiVCL2yiG56 NUs1MK3ukN3CVdRgraUMDCkcalfcx9KDKVauOHA8S+cUrj8a3ISIu935twZD3WmUN4iRE8yCEqb tgX85b9ku7tom/RpjhdXfFNHl1dtGypSguqIq X-Gm-Gg: Acq92OH6iWMCr8soYklWfUSrltauyfoGNNQbD0xKsLPzzGtpqb9eqiXUlYTw914/+YQ VfccRpQQXCq2q0FSVzHq6wNCs4OMOmuEMjaT0fHA9qD0/A4mFshSjrbxDUsKP6kXIdlFr/OD65J sYG9eBMFEbp39uCZmIkThGm67O6HR8fhaxRmYyr6bwrxHw/KeUwY8xlPosVrIC/sitdIwulCxzx OxzmclvEgXJDHeKcAxpo47nF4kuXi95IylT/qX3Fh2IiZ37jaYEKkj1aNyj3IP+6BSquS3F7pRP 5+ZdhTpcSXyNqTlIpA== X-Received: by 2002:a05:6214:800a:b0:8ac:ac2f:c8b4 with SMTP id 6a1803df08f44-8cc7b3ba086mr221403776d6.0.1779821624550; Tue, 26 May 2026 11:53:44 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Adam Brusselback Date: Tue, 26 May 2026 14:53:28 -0400 X-Gm-Features: AVHnY4LvL4_hsB3Ox_Ihb7CnyJwN023qWRjCSgsdnz5qLJNZ0OlPl9xvAYdu49s Message-ID: Subject: Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW To: vellaipandiyan sm Cc: PostgreSQL Hackers Content-Type: multipart/alternative; boundary="000000000000b1e8130652bd0334" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000b1e8130652bd0334 Content-Type: text/plain; charset="UTF-8" Hi Vellaipandiyan, thanks for the review. > I wonder whether overlapping refreshes could still encounter deadlock > scenarios around UPSERT conflicts. That was definitely a gap. The FOR UPDATE step previously issued: SELECT 1 FROM mv WHERE (...) FOR UPDATE with no ORDER BY, so two overlapping refreshes could lock the existing rows in different physical orders and deadlock before either one reached the next statement. To fix this, the next patch gives the locking SELECT a deterministic ORDER BY on the unique key columns so every refresh acquires row locks in the same sequence. The upsert now feeds from an ordered source as well. > The CONCURRENTLY behavior also feels somewhat unintuitive here. I'll include this change in the next patch. After more thinking about it (and some feedback), I decided that going this way seems better. So any prior discussion about CONCURRENTLY vs without is now talking about the inverse of the current version of the patch. Additionally, I've fixed a "scope drift" issue where the two different implementations had inconsistent behavior. A row's non-key column changes in the base table, shifting it into the WHERE predicate's scope. Previously, this caused a unique constraint violation for the match_merge path because the old row was invisible to the filter and never deleted. This was previously handled by the direct_mod path properly. Now, both the direct-modification and match/merge paths resolve this by using an INSERT ... ON CONFLICT DO UPDATE step (or DO NOTHING if there are no non-key columns) against the arbiter index. This safely updates the out-of-scope stale row in place without duplicating the key. > It may also help to document the intended guarantees around overlapping > partial refreshes and concurrent DML on base tables. Here is my attempt at saying what the implementation actually guarantees; I will fit this into the docs in some way. If anyone notices something wrong with the below, please speak up. Concurrent Partial Refresh (direct_mod path): The FOR UPDATE / upsert-CTE path takes RowExclusiveLock. Readers are never blocked. Neither RowExclusiveLock nor the per-row FOR UPDATE locks conflict with a plain SELECT. The refresh issues two statements in one transaction: the locking SELECT, then the select/upsert/ anti-join-delete CTE. Under the default READ COMMITTED, these take separate snapshots, but the FOR UPDATE row locks are held to transaction end, so they bridge both statements. An overlapping refresh blocks on those locks for the full duration of the refreshing transaction, not just for the locking SELECT, so the gap between the two statements does not expose existing MV rows. The CTE's upsert and delete are a single statement, so a reader sees either the pre-refresh or post-refresh state of an affected row, never a half-applied one. Concurrent partial refreshes whose predicates touch overlapping existing MV rows are serialized. The second waits on the first's FOR UPDATE locks. Refreshes over disjoint row sets run in parallel. Within the predicate scope, the MV is made consistent with the query snapshot: rows present in the snapshot are upserted, rows that no longer appear are deleted via the anti-join. Rows whose key falls outside the predicate are not touched. Note this is keyed on the unique index, not on the predicate. As mentioned with the drift fix, if the predicate matches a fresh row whose key collides with an existing MV row that does not currently match the predicate, ON CONFLICT will update that existing row, and the step-one FOR UPDATE will not have locked it. This only arises when the predicate references non-key columns. When the predicate ranges over the unique-key columns, the colliding row necessarily matches too. It does not lock the base tables. Base-table DML committed after the refresh's snapshot is not reflected, identical to a normal full REFRESH. SELECT FOR UPDATE only serializes overlapping refreshes covering rows that already exist in the MV. Two refreshes that both insert the same new logical key are serialized by ON CONFLICT and the unique index, not by FOR UPDATE. The outcome is still correct. The last writer wins on that key. The predicate must be non-volatile (enforced) and a usable unique index is required (enforced). The above assumes READ COMMITTED. I haven't thought through how things will work with other isolation levels. Non-Concurrent Partial Refresh (match_merge path): The WHERE diff/merge path takes an ExclusiveLock and operates similarly to the standard full-refresh diff/merge, but with the diff scope restricted to rows matching the predicate. Readers are allowed, writers are blocked, and overlapping refreshes are serialized at the table level. The main difference from a full concurrent refresh is that its final insert uses ON CONFLICT DO UPDATE (just like the direct_mod path) specifically to resolve unique key violations caused by rows drifting into the predicate's scope. I will provide a new patch shortly. Thanks, Adam Brusselback --000000000000b1e8130652bd0334 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Vellaipandiyan, thanks for the review.

> I wo= nder whether overlapping refreshes could still encounter deadlock
> s= cenarios around UPSERT conflicts.

That was definitely a gap. The FOR= UPDATE step previously issued:

=C2=A0 =C2=A0 SELECT 1 FROM mv WHERE= (...) FOR UPDATE

with no ORDER BY, so two overlapping refreshes cou= ld lock the existing
rows in different physical orders and deadlock befo= re either one reached
the next statement.

To fix this, the next p= atch gives the locking SELECT a deterministic
ORDER BY on the unique key= columns so every refresh acquires row locks
in the same sequence. The u= psert now feeds from an ordered source as
well.

> The CONCURRE= NTLY behavior also feels somewhat unintuitive here.

I'll include= this change in the next patch. After more thinking about
it (and s= ome feedback), I decided that going this way seems better. So any
prior discussion about CONCURRENTLY vs without is now talking about the
inverse of the current version of the patch.

Addition= ally, I've fixed a "scope drift" issue where the two differen= t implementations
had inconsistent behavior. A row's non-key= =C2=A0column changes in the base table, shifting it=C2=A0
into th= e WHERE=C2=A0predicate's scope. Previously, this caused a unique constr= aint
violation=C2=A0for the match_merge path because the old row = was invisible to the filter
and never deleted. This was previously han= dled by the direct_mod path properly.

Now, both the direct-modifica= tion and match/merge paths resolve this by
using an INSERT ... ON CONFLI= CT DO UPDATE step (or DO NOTHING if there
are no non-key columns) agains= t the arbiter index. This safely updates
the out-of-scope stale row in p= lace without duplicating the key.

> It may also help to document = the intended guarantees around overlapping
> partial refreshes and co= ncurrent DML on base tables.

Here is my attempt at saying what the i= mplementation actually guarantees;
I will fit this into the docs in some= way. If anyone notices something
wrong with the below, please speak up.=

Concurrent Partial Refresh (direct_mod path):
The FOR UPDATE / u= psert-CTE path takes RowExclusiveLock. Readers are
never blocked. Neithe= r RowExclusiveLock nor the per-row FOR UPDATE
locks conflict with a plai= n SELECT. The refresh issues two statements
in one transaction: the lock= ing SELECT, then the select/upsert/
anti-join-delete CTE. Under the defa= ult READ COMMITTED, these take
separate snapshots, but the FOR UPDATE ro= w locks are held to transaction
end, so they bridge both statements. An = overlapping refresh blocks on
those locks for the full duration of the r= efreshing transaction, not
just for the locking SELECT, so the gap betwe= en the two statements does
not expose existing MV rows. The CTE's up= sert and delete are a single
statement, so a reader sees either the pre-= refresh or post-refresh
state of an affected row, never a half-applied o= ne.

Concurrent partial refreshes whose predicates touch overlapping = existing
MV rows are serialized. The second waits on the first's FOR= UPDATE
locks. Refreshes over disjoint row sets run in parallel. Within = the
predicate scope, the MV is made consistent with the query snapshot: = rows
present in the snapshot are upserted, rows that no longer appear ar= e
deleted via the anti-join.

Rows whose key falls outside the pre= dicate are not touched. Note this
is keyed on the unique index, not on t= he predicate. As mentioned with
the drift fix, if the predicate matches = a fresh row whose key collides
with an existing MV row that does not cur= rently match the predicate,
ON CONFLICT will update that existing row, a= nd the step-one FOR UPDATE
will not have locked it. This only arises whe= n the predicate references
non-key columns. When the predicate ranges ov= er the unique-key columns,
the colliding row necessarily matches too.
It does not lock the base tables. Base-table DML committed after therefresh's snapshot is not reflected, identical to a normal full REFRE= SH.

SELECT FOR UPDATE only serializes overlapping refreshes covering= rows
that already exist in the MV. Two refreshes that both insert the s= ame
new logical key are serialized by ON CONFLICT and the unique index, = not
by FOR UPDATE. The outcome is still correct. The last writer wins on=
that key.

The predicate must be non-volatile (enforced) and a us= able unique index
is required (enforced).

The above assumes READ = COMMITTED. I haven't thought through how things
will work with other= isolation levels.

Non-Concurrent Partial Refresh (match_merge path)= :
The WHERE diff/merge path takes an ExclusiveLock and operates similarl= y
to the standard full-refresh diff/merge, but with the diff scope
re= stricted to rows matching the predicate. Readers are allowed, writers
ar= e blocked, and overlapping refreshes are serialized at the table
level. = The main difference from a full concurrent refresh is that its
final ins= ert uses ON CONFLICT DO UPDATE (just like the direct_mod path)
specifica= lly to resolve unique key violations caused by rows drifting
into the pr= edicate's scope.

I will provide a new patch shortly.

Than= ks,
Adam Brusselback
--000000000000b1e8130652bd0334--