Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vT0ZW-00BEzE-29 for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Dec 2025 16:28:18 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vT0ZV-005W2m-0n for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Dec 2025 16:28:17 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vT0ZU-005W2d-36 for pgsql-hackers@lists.postgresql.org; Tue, 09 Dec 2025 16:28:17 +0000 Received: from mail-qv1-xf33.google.com ([2607:f8b0:4864:20::f33]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vT0ZT-003xUS-1a for pgsql-hackers@lists.postgresql.org; Tue, 09 Dec 2025 16:28:16 +0000 Received: by mail-qv1-xf33.google.com with SMTP id 6a1803df08f44-8805d8ea80cso4540016d6.3 for ; Tue, 09 Dec 2025 08:28:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1765297694; x=1765902494; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=KVz1TYF3u8IDsaGV47J+MmzNABOqJyQNn36dz3Bdugo=; b=VFajEJ6ivqjk73am44l7ZtJ+nr+9M54EnhGRc6Vcc08KY+UUkVXAeGaZIp2imTkQlE MgVpV3WVDIU7an0YhePC/2luTPgUWmLpU65Wlrst9EcBo49WlQXmSb/Edlv4lXRPsqSC v/SKnNTESMhbvB6fW16h7fNLOsBsqVv8+kfORZRNMcDi7pKwU9ewicGp0xOtcbS1lWNr +Q9ZLf+x4ZIo0JQP8XXq3jumZwcs5Z+RpCYNRpa1vB7haeVVqrVNCG1skMDN/BGaz3yP L1u6U1XtCYubUVi/9tuoRO7FapJOxI7DGqgSw3p0DTIZHgzctRUJJvUdxJGOADU4BBs5 Yz8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1765297694; x=1765902494; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=KVz1TYF3u8IDsaGV47J+MmzNABOqJyQNn36dz3Bdugo=; b=BBoAHN2OU1Fq/9ypHqP1Ej9x1Kr7KZN7OWvCYLReVe/dZ6jB4WQtRakOt6Kl2w2uId Dhab3UcGAfev8H1P6MvopXysdTXMHI8wUAmhkGuy3b95qnXbvPHTJ8q83xWwLXl13rL/ VgY3YMoBm9o52YE53UGElW2gbl6DyeAQ86pV4zRSqEJ9fOCf9uUduy2dAP+Xd0utSeoc sHZp5WZ3HUirZbu04gBfkU09lL0UkaZzXWYXuKZZUtNlJxc3k4kVuF4N63tdCw9K6mU6 7Dy4rXLw4V0vEtm5t6F2WKrMFvqYHBsze/OjTng4lbYr0txHzj/kZxRUrs9VVFE3YNHR J3iw== X-Gm-Message-State: AOJu0YzyRl+6Q11pqivseDP5gU0k1NvpYJLf+ZzgjWsSPpIvffENBzcl AFgMQ/mHPcsbyP7m4wxpuBnPkP4zJf7f447fLjvtPVZUCMBhLXKA0CRf3NuSoAndvi6LbbmnQwS iQTnyV43WU8VoPIW4JwZ2BERUNFn4xc8= X-Gm-Gg: AY/fxX5SqmBhXPIBxDFdqvi7rmXtrCn5E4MWXtdynDdX53mFz5kjwk5U5Zr64iyuFW2 h7ZNX8Wn8Cvq8OYXrZocy4YmhUhcQ6uuWn/MtOxGr7mjXsNWyaZ4R7nMkg+iCPRz6ALoU39n8tC thjA7frMGx7pMEDRJzeeKgxrpkhrBH4vw9JycRvAncveT9afz6+XtS1ahwVsh+t99PRw6OHBcrS QVU2ugkHvnp+vsEcC+Bqa7wUj1BUYdb0e4LsT/OgQiJMKQzOFvx7HqOt+3UW07s8CValoE= X-Google-Smtp-Source: AGHT+IHBTBLCoN902axQUN2IgNrOQ1GDSMYn7R+hy/Qd/qM/TxZFlNA7LhQNMAaS4OJfoVowOZv4FLf8jxsfNhd9jJs= X-Received: by 2002:ad4:4ea8:0:b0:880:5409:cb7c with SMTP id 6a1803df08f44-8883d8942f3mr149552706d6.0.1765297693949; Tue, 09 Dec 2025 08:28:13 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Adam Brusselback Date: Tue, 9 Dec 2025 11:27:58 -0500 X-Gm-Features: AQt7F2pHpoSx4KBsygekZooHX0t0fXahBw7sqYNXKwWvkvJovq0YgvgbqdINrqM Message-ID: Subject: Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW To: Kirk Wolak Cc: PostgreSQL Hackers Content-Type: multipart/alternative; boundary="000000000000f881e90645876579" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000f881e90645876579 Content-Type: text/plain; charset="UTF-8" > Our situation was a wonderful MV with all the columns we needed (some > hard to calculate) to augment search data done millions of times/day. It > was a thing of beauty. Until we realized we needed to update 1 record > (vendor inventory UPDATE date/time) every time we processed a file > (something we do 24x7, a hundred times each hour! > > For that ONE field, we ended up doing REFRESH MV concurrently; OVER > 2,000 times per day. Thanks for the feedback and the use case, Kirk. Regarding that specific scenario where a single column ("last updated" or similar) churns significantly faster than the heavy-computation columns: Even with this patch, you might find it beneficial to separate that high-velocity column into its own small materialized view (or regular view) and join it to the main MV at query time. That will reduce the bloat you get on the main MV by quite a lot, especially if you have very wide rows (which it seems like you do). I initially tried to implement logic that would allow for direct UPDATEs (which would enable HOT updates). However, to handle rows that matched the predicate but were no longer present in the new source data, I had to run an anti-join to identify them for deletion. That approach caused performance issues, so I settled on the "Prune + Upsert" strategy (DELETE matching rows, then INSERT from source). Because this patch performs a delete/insert cycle, updating that one timestamp column will still result in rewriting the whole tuple in the MV. > For that ONE field, we ended up doing REFRESH MV concurrently; OVER > 2,000 times per day. That said, 2,000 refreshes per day is nothing for this implementation, provided your updates are selective enough and your queries allow for predicate push-down to the base tables. I look forward to your thoughts after reviewing the code. Thanks, Adam Brusselback --000000000000f881e90645876579 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
> =C2=A0 Our situation was a wonderful MV with all the = columns we needed (some
> hard to calculate) to augment search data d= one millions of times/day.=C2=A0 It
> was a thing of beauty.=C2=A0 Un= til we realized we needed to update 1 record
> (vendor inventory UPDA= TE date/time) every time we processed a file
> (something we do 24x7,= a hundred times each hour!
>
> =C2=A0 For that ONE field, we e= nded up doing REFRESH MV concurrently; =C2=A0OVER
> 2,000 times per d= ay.

Thanks for the feedback and the use case, Kirk.

Regarding= that specific scenario where a single column ("last updated" or = similar) churns significantly faster than the heavy-computation columns:Even with this patch, you might find it beneficial to separate that high-v= elocity column into its own small materialized view (or regular view) and j= oin it to the main MV at query time. That will reduce the bloat you get on = the main MV by quite a lot, especially if you have very wide rows (which it= seems like you do).

I initially tried to implement logic that would= allow for direct UPDATEs (which would enable HOT updates). However, to han= dle rows that matched the predicate but were no longer present in the new s= ource data, I had to run an anti-join to identify them for deletion. That a= pproach caused performance issues, so I settled on the "Prune + Upsert= " strategy (DELETE matching rows, then INSERT from source).

Bec= ause this patch performs a delete/insert cycle, updating that one timestamp= column will still result in rewriting the whole tuple in the MV.

&g= t; =C2=A0 For that ONE field, we ended up doing REFRESH MV concurrently; = =C2=A0OVER
> 2,000 times per day.

That said, 2,000 refreshes p= er day is nothing for this implementation, provided your updates are select= ive enough and your queries allow for predicate push-down to the base table= s.

I look forward to your thoughts after reviewing the code.

= Thanks,
Adam Brusselback
--000000000000f881e90645876579--