Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vEGEN-008XRd-SG for pgsql-general@arkaria.postgresql.org; Thu, 30 Oct 2025 00:09:31 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vEGEM-004ggw-Pm for pgsql-general@arkaria.postgresql.org; Thu, 30 Oct 2025 00:09:29 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vEGEM-004ggm-Bs for pgsql-general@lists.postgresql.org; Thu, 30 Oct 2025 00:09:29 +0000 Received: from mail-ed1-x52f.google.com ([2a00:1450:4864:20::52f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vEGEJ-004TKS-13 for pgsql-general@lists.postgresql.org; Thu, 30 Oct 2025 00:09:28 +0000 Received: by mail-ed1-x52f.google.com with SMTP id 4fb4d7f45d1cf-63bea08a326so532476a12.3 for ; Wed, 29 Oct 2025 17:09:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761782966; x=1762387766; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=GkJiiS2xFe/hVNZbc2v1bdWQISOJdLiAS9ajhAlxhx4=; b=WZvdfYCBbsv8kpogy50nEbvcvc01gMWY5NQwiwva7rlQks6vR7yKaiLym8TvXMCukq IIjtCGmCG0Bhs8TTRs9FmtS4o/Lu6/MQbBhaWhF+bhB2JndRtNeN0avukETaFR4pO5X9 fj1s7urEOpYvU5TzaE5xIswdf4RJfPmsSJR6vuybiJs+FYujOVTnHPCdLQ6R+hdSMt4p 4H7Yc57zhoI7NzJjfFzAj0ULyccPVGeHY2dgynwT03/t3Dz+Q6gZiy8lYlfZqQC+saQ1 8agEzx3pbAwcj2VUZ7iYs/oiPE7RdjRlGRvZzP42IZASOdLyxl6u1EXond7q48ONaXfc q/ow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761782966; x=1762387766; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=GkJiiS2xFe/hVNZbc2v1bdWQISOJdLiAS9ajhAlxhx4=; b=Me18do7sr7NuhEHbCk9diYrx2wFCX3TjoGgrLb2iEBgzYi+WYvXDuVrMWi6R369o1y /oUgaaHnpeZ/ZZM4NyoREDCQUA0qgYYSPojEgnCKkKi1WCdF5ivBohfI6pp0nJ5SGzd/ 9ylsv+HY6M3qPGK1DfgAhXo6Z8HLbK0df2aXTUqX5lcTvRCnT+0RtrYhJOdJi3j2HNIc BwEdGFkdKk/Wg3ZxV4qVJbbgu6L6eHmH4e11SLSiFF8F6EuyZ6f2oK/2DAhaUq1gYVc2 yHcNBkeeM5KwkOhnULJdnxCaKL9GR6P3LivqugIfWP2BV4Vwjz/dReO1wOpN/HloZOdb vgyg== X-Gm-Message-State: AOJu0Yy4WaBXh18OiGjD7FzgH68dUqKMXYDo1m3kQG8y0eqRx1RNcUgf p2YITuL2/lM3g73je0ne4iueydevAhx7bA9LFI+fA2Oqxeg9khWWidrECUsR1iHfbgw1/U6wVo/ YbpZi6gOobfzD00fOWohoD+SFHJ6eBSK2cg== X-Gm-Gg: ASbGncuoWmDBtshJ71GrlsuPqHZB8703aYXH8fkV/JtSjDSRxPLOKgNB1Jz8bVs9MGw l+YSHyn37r3aqUpwb+/laCXrcXaL8OYSXFzfwaqsdZMHEcNerCB27WO8FZy9i07AVWK8amlfJiT hH3viinzXCACQBlsRCaP8e5vmv0TXbzWazk1FQJCVVIJw8Y7Bxr4kVtfVy30JUl4Vypovm3grUu 53bmcdTVT9WaqJsW0VVt5oherKGzYc+vcERi4J44aXDv1+tD4RDWJTZX8k= X-Google-Smtp-Source: AGHT+IGAz77/hQ0m8KA+X/ZaJd3xGSlmYVXIMuSMVyy5Qb82tWvJAEueLYs+D5VOJjggxNXPelZABqo42YjhnEyys7k= X-Received: by 2002:a05:6402:35d5:b0:63b:f5cb:e1fa with SMTP id 4fb4d7f45d1cf-640441aea41mr3483380a12.11.1761782965430; Wed, 29 Oct 2025 17:09:25 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Jacob Jackson Date: Wed, 29 Oct 2025 20:09:14 -0400 X-Gm-Features: AWmQ_blHayJF_opThGjXwgOUcWZhjim1j6KRMkH3RbUcd2xXRQTVkz0LvkKdNLo Message-ID: Subject: Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique? To: David Rowley Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000d36ef80642550f0c" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000d36ef80642550f0c Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks! This is very helpful. On Wed, Oct 29, 2025 at 8:01=E2=80=AFPM David Rowley = wrote: > On Thu, 30 Oct 2025 at 07:29, Jacob Jackson > wrote: > > Hello. I was looking at some query plans recently and noticed something > that didn't make sense. I have a query that joins a table of questions wi= th > results for each question (using a table with a composite primary key of > question id and a user id), filtered by user id. The question IDs and the > combined question-userIds are guaranteed unique due to being primary keys= , > and yet Postgres still memoizes the inner loop results. Any ideas why? > > I agree that when the outer side of the join has unique values that > Memoize does not make any sense. The planner currenrly puts quite a > bit of faith in the row estimates for this and if you're getting this > plan, then the estimates came back indicating there'd be fewer unique > values of "QuestionUserStatus".question in the input than there are > input rows to the Memoize node. If you delve into > cost_memoize_rescan(), you'll see the code for this (look for where > "hit_ratio" is calculated). > > There are also a few prechecks in get_memoize_path() to try to avoid > this sort of thing, but unfortunately, the information to avoid > Memoize when the outer side of the join is unique isn't available. We > do have an "inner_unique" in JoinPathExtraData, but what we'd need for > this and don't have is "outer_unique". If we had that, we could just > exit early in get_memoize_path() if that's set to true. Whether or not > going to the trouble of calculating "outer_unique" is worth the > trouble, I'm not sure. There was some work on UniqueKeys a few years > ago, which could have helped in this scenario as we could have more > easily identified uniqueness at different join levels. That's no > longer being worked on, as I understand it. > > On the other hand, it may be better to somehow enhance > estimate_num_groups() so it can be given more details about the > context of the request, i.e the set of Relids that are joined already > for the input_rows. That way the code could do more analysis into the > RelOptInfo base quals for the relevant relations. Extended statistics > for n_distinct could also be applied in some cases too by looking for > baserestrictinfo with equality quals or EquivalenceClasses with > ec_has_const =3D true and a member for other Vars/Exprs in the extended > statistics. > > Unfortunately, neither of these is a trivial fix. > > David > --000000000000d36ef80642550f0c Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thanks! This is very helpful.

On Wed,= Oct 29, 2025 at 8:01=E2=80=AFPM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 30 Oct 2025 at 07:29, Jacob Jack= son <jej.j= ackson.08@gmail.com> wrote:
> Hello. I was looking at some query plans recently and noticed somethin= g that didn't make sense. I have a query that joins a table of question= s with results for each question (using a table with a composite primary ke= y of question id and a user id), filtered by user id. The question IDs and = the combined question-userIds are guaranteed unique due to being primary ke= ys, and yet Postgres still memoizes the inner loop results. Any ideas why?<= br>
I agree that when the outer side of the join has unique values that
Memoize does not make any sense. The planner currenrly puts quite a
bit of faith in the row estimates for this and if you're getting this plan, then the estimates came back indicating there'd be fewer unique values of "QuestionUserStatus".question in the input than there a= re
input rows to the Memoize node. If you delve into
cost_memoize_rescan(), you'll see the code for this (look for where
"hit_ratio" is calculated).

There are also a few prechecks in get_memoize_path() to try to avoid
this sort of thing, but unfortunately, the information to avoid
Memoize when the outer side of the join is unique isn't available.=C2= =A0 We
do have an "inner_unique" in JoinPathExtraData, but what we'd= need for
this and don't have is "outer_unique". If we had that, we cou= ld just
exit early in get_memoize_path() if that's set to true. Whether or not<= br> going to the trouble of calculating "outer_unique" is worth the trouble, I'm not sure. There was some work on UniqueKeys a few years ago, which could have helped in this scenario as we could have more
easily identified uniqueness at different join levels. That's no
longer being worked on, as I understand it.

On the other hand, it may be better to somehow enhance
estimate_num_groups() so it can be given more details about the
context of the request, i.e the set of Relids that are joined already
for the input_rows. That way the code could do more analysis into the
RelOptInfo base quals for the relevant relations. Extended statistics
for n_distinct could also be applied in some cases too by looking for
baserestrictinfo with equality quals or EquivalenceClasses with
ec_has_const =3D true and a member for other Vars/Exprs in the extended
statistics.

Unfortunately, neither of these is a trivial fix.

David
--000000000000d36ef80642550f0c--