public inbox for [email protected]  
help / color / mirror / Atom feed
Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique?
2+ messages / 2 participants
[nested] [flat]

* Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique?
@ 2025-10-30 00:01 David Rowley <[email protected]>
  2025-10-30 00:09 ` Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique? Jacob Jackson <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: David Rowley @ 2025-10-30 00:01 UTC (permalink / raw)
  To: Jacob Jackson <[email protected]>; +Cc: [email protected]

On Thu, 30 Oct 2025 at 07:29, Jacob Jackson <[email protected]> wrote:
> Hello. I was looking at some query plans recently and noticed something that didn't make sense. I have a query that joins a table of questions with results for each question (using a table with a composite primary key of question id and a user id), filtered by user id. The question IDs and the combined question-userIds are guaranteed unique due to being primary keys, and yet Postgres still memoizes the inner loop results. Any ideas why?

I agree that when the outer side of the join has unique values that
Memoize does not make any sense. The planner currenrly puts quite a
bit of faith in the row estimates for this and if you're getting this
plan, then the estimates came back indicating there'd be fewer unique
values of "QuestionUserStatus".question in the input than there are
input rows to the Memoize node. If you delve into
cost_memoize_rescan(), you'll see the code for this (look for where
"hit_ratio" is calculated).

There are also a few prechecks in get_memoize_path() to try to avoid
this sort of thing, but unfortunately, the information to avoid
Memoize when the outer side of the join is unique isn't available.  We
do have an "inner_unique" in JoinPathExtraData, but what we'd need for
this and don't have is "outer_unique". If we had that, we could just
exit early in get_memoize_path() if that's set to true. Whether or not
going to the trouble of calculating "outer_unique" is worth the
trouble, I'm not sure. There was some work on UniqueKeys a few years
ago, which could have helped in this scenario as we could have more
easily identified uniqueness at different join levels. That's no
longer being worked on, as I understand it.

On the other hand, it may be better to somehow enhance
estimate_num_groups() so it can be given more details about the
context of the request, i.e the set of Relids that are joined already
for the input_rows. That way the code could do more analysis into the
RelOptInfo base quals for the relevant relations. Extended statistics
for n_distinct could also be applied in some cases too by looking for
baserestrictinfo with equality quals or EquivalenceClasses with
ec_has_const = true and a member for other Vars/Exprs in the extended
statistics.

Unfortunately, neither of these is a trivial fix.

David






^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique?
  2025-10-30 00:01 Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique? David Rowley <[email protected]>
@ 2025-10-30 00:09 ` Jacob Jackson <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Jacob Jackson @ 2025-10-30 00:09 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: [email protected]

Thanks! This is very helpful.

On Wed, Oct 29, 2025 at 8:01 PM David Rowley <[email protected]> wrote:

> On Thu, 30 Oct 2025 at 07:29, Jacob Jackson <[email protected]>
> wrote:
> > Hello. I was looking at some query plans recently and noticed something
> that didn't make sense. I have a query that joins a table of questions with
> results for each question (using a table with a composite primary key of
> question id and a user id), filtered by user id. The question IDs and the
> combined question-userIds are guaranteed unique due to being primary keys,
> and yet Postgres still memoizes the inner loop results. Any ideas why?
>
> I agree that when the outer side of the join has unique values that
> Memoize does not make any sense. The planner currenrly puts quite a
> bit of faith in the row estimates for this and if you're getting this
> plan, then the estimates came back indicating there'd be fewer unique
> values of "QuestionUserStatus".question in the input than there are
> input rows to the Memoize node. If you delve into
> cost_memoize_rescan(), you'll see the code for this (look for where
> "hit_ratio" is calculated).
>
> There are also a few prechecks in get_memoize_path() to try to avoid
> this sort of thing, but unfortunately, the information to avoid
> Memoize when the outer side of the join is unique isn't available.  We
> do have an "inner_unique" in JoinPathExtraData, but what we'd need for
> this and don't have is "outer_unique". If we had that, we could just
> exit early in get_memoize_path() if that's set to true. Whether or not
> going to the trouble of calculating "outer_unique" is worth the
> trouble, I'm not sure. There was some work on UniqueKeys a few years
> ago, which could have helped in this scenario as we could have more
> easily identified uniqueness at different join levels. That's no
> longer being worked on, as I understand it.
>
> On the other hand, it may be better to somehow enhance
> estimate_num_groups() so it can be given more details about the
> context of the request, i.e the set of Relids that are joined already
> for the input_rows. That way the code could do more analysis into the
> RelOptInfo base quals for the relevant relations. Extended statistics
> for n_distinct could also be applied in some cases too by looking for
> baserestrictinfo with equality quals or EquivalenceClasses with
> ec_has_const = true and a member for other Vars/Exprs in the extended
> statistics.
>
> Unfortunately, neither of these is a trivial fix.
>
> David
>


^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2025-10-30 00:09 UTC | newest]

Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-10-30 00:01 Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique? David Rowley <[email protected]>
2025-10-30 00:09 ` Jacob Jackson <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox