Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1snh5R-005Qa0-Dj for pgsql-general@arkaria.postgresql.org; Mon, 09 Sep 2024 16:17:59 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1snh5Q-00EnOe-NI for pgsql-general@arkaria.postgresql.org; Mon, 09 Sep 2024 16:17:56 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1snh5P-00EnOJ-7U for pgsql-general@lists.postgresql.org; Mon, 09 Sep 2024 16:17:56 +0000 Received: from fhigh5-smtp.messagingengine.com ([103.168.172.156]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1snh5L-000KKP-OQ for pgsql-general@postgresql.org; Mon, 09 Sep 2024 16:17:54 +0000 Received: from phl-compute-12.internal (phl-compute-12.phl.internal [10.202.2.52]) by mailfhigh.phl.internal (Postfix) with ESMTP id 2E73E11401E5; Mon, 9 Sep 2024 12:17:50 -0400 (EDT) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-12.internal (MEProxy); Mon, 09 Sep 2024 12:17:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=aklaver.com; h= cc:content-transfer-encoding:content-type:content-type:date:date :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm3; t=1725898670; x=1725985070; bh=uiaJ+Nc1vOo5+s6B2oO1421GmoD1bxXLhkaukl4BQxs=; b= r0yDOS9I5QvR9d8jXDWfdXGecQstdgNVq8H3XHG50A1mKYpbcDclIB86l6xbwbRS W3fPfd5AA7Hs4ETZxnGEKYW+aIVIuRyocCM7cU+tD86zFqHU7JTifRhLg3QXQS8I fLJJs6dgQUd6qzmk5w71diS1dgQVlcaDZKiqqGetrqRT7ZfS3fZDWhNOh+15Fefs YQm+5crWguYeqPrfTsOQv377B3Zro523DWFmCcaLuYWfYDykqUYgVgbmbNRNVrwp AmrKvgMWSEfd2BIJsGnzBW5TKyOBKGOM1Brpd7MOBRnBwaFceW0geWKssPwIK9ZG ujQrlZyzn7Nijvo2BeOG/w== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1725898670; x= 1725985070; bh=uiaJ+Nc1vOo5+s6B2oO1421GmoD1bxXLhkaukl4BQxs=; b=W 012W6qpIWAp/f//sRt40vzq8lF2HoxnkGk/EClyfcxlJ83GjCt+o7eUqgmbFqUAo 1rcZKSRdwBkc+0Oqs9mWs1tbJjfXarvXpm3+UqgmxFvpKQ+lmLapfSQVD414awl2 Svb9IBfRnse/uchOsWeujlNVQl/9UmDEpoorHUBBiRt2QWcTSoVQpvTip9axbyh8 mQdodENNbU9F5VrowZMV6eWCtv7C6gU9C6aQQ7odJM3avIxEru6DyVu+Fs7RvDkb Qnh0xzE6pz69atu2xpwxk3WIMw4/qRsntHhnm1tT4368iw9+ftB/MhEeEonXcbAC LMZu1Dlf+3Mwv9HadJhtA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrudeijedgkeefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh htshculddquddttddmnecujfgurhepkfffgggfuffvfhfhjggtgfesthejredttddvjeen ucfhrhhomheptegurhhirghnucfmlhgrvhgvrhcuoegrughrihgrnhdrkhhlrghvvghrse grkhhlrghvvghrrdgtohhmqeenucggtffrrghtthgvrhhnpeejheefvdehtdfghffgueeu vdfhjefhffdugeeufeelffeftefhhffftddtfefhheenucffohhmrghinhepmhhitghroh hsohhfthdrtghomhdpphhoshhtghhrvghsqhhlrdhorhhgnecuvehluhhsthgvrhfuihii vgeptdenucfrrghrrghmpehmrghilhhfrhhomheprggurhhirghnrdhklhgrvhgvrhesrg hklhgrvhgvrhdrtghomhdpnhgspghrtghpthhtohepvddpmhhouggvpehsmhhtphhouhht pdhrtghpthhtohepphhhihhlihhprdhhrgiivghluggvnhesghhmrghilhdrtghomhdprh gtphhtthhopehpghhsqhhlqdhgvghnvghrrghlsehpohhsthhgrhgvshhqlhdrohhrgh X-ME-Proxy: Feedback-ID: i76984098:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 9 Sep 2024 12:17:49 -0400 (EDT) Message-ID: <6f7c08ff-8ecd-4b86-acae-24a898183a49@aklaver.com> Date: Mon, 9 Sep 2024 09:17:48 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Clarify this MERGE warning? "Only columns from the target table that attempt to match data_source rows should appear in join_condition." To: Philip Hazelden , pgsql-general@postgresql.org References: Content-Language: en-US From: Adrian Klaver In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 9/9/24 06:02, Philip Hazelden wrote: > The MERGE docs[1] give this warning: > >> Only columns from the target table that attempt to match >> `data_source` rows should appear in `join_condition`. >> `join_condition` subexpressions that only reference the target >> table's columns can affect which action is taken, often in >> surprising ways. > > (The docs for upcoming v17 have the same line.) > > > So why should I avoid doing this? It's not clear to me whether the > warning is saying "this likely won't work like you expect because it's > difficult to reason about" or "because the behavior is unspecified" or > "because there's a bug" or what. > > I found a thread[2] on the psql-hackers list which has this snippet of > conversation: > >>>> * It might make sense to point out in the docs that join_condition >>>> should not filter the target table too much. Like SQL server docs say, >>>> don't put things in the join that filter the target that actually >>>> belong in the WHEN .. AND quals. In a way, this should be obvious, >>>> because it's an outer join. But I don't think it is, and ISTM that the >>>> sensible thing to do is to warn against it. FYI the SQL Server note does not shed any more light on this: https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-ver16 Caution It's important to specify only the columns from the target table to use for matching purposes. That is, specify columns from the target table that are compared to the corresponding column of the source table. Don't attempt to improve query performance by filtering out rows in the target table in the ON clause; for example, such as specifying AND NOT target_table.column_x = value. Doing so can return unexpected and incorrect results. > This makes me think the warning is trying to say something like: "if > you can move a subexpression from `join_condition` to `WHEN ... AND`, > you should probably do so". Is that right? > > (I still don't know *why* I should do that. It sounds like maybe it's > more efficient that way because this is an outer join? But I don't > know why that matters. If I'd had to guess which would be more > efficient, I'd have weakly guessed "prefer to do it in a join, indexes > will get used better that way".) I would say because you could end with WHEN clauses that are at odds with the JOIN clause. In other words you throw away rows in the JOIN that you need later in the WHEN(s). Basically keep rows around until you are sure they are not needed. > > And I think it's not always possible to move a subexpression without > changing `data_source`. In the example I posted above, I don't think > it would work, since we'd no longer have a `NOT MATCHED` on `t2`'s > `(3, '1.3 v2')` row. > > [1]: https://www.postgresql.org/docs/current/sql-merge.html > [2]: https://www.postgresql.org/message-id/flat/CANP8%2BjKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc%2BXrz8QB0nXA%40mail.gmail.com > > -- Adrian Klaver adrian.klaver@aklaver.com