public inbox for [email protected]
help / color / mirror / Atom feedFrom: Jacob Jackson <[email protected]>
To: David Rowley <[email protected]>
Cc: [email protected]
Subject: Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique?
Date: Wed, 29 Oct 2025 20:09:14 -0400
Message-ID: <CAAiQw3w-uEZkz13Y2iU76CucRXBoR=kw6wOfq6gocT63dZLfYg@mail.gmail.com> (raw)
In-Reply-To: <CAApHDvoWuMGh4z+=vY_gh-Oyh_9ZXWJnhO2JY+f+h4s9PL88Jg@mail.gmail.com>
References: <CAAiQw3yBPrCw6ZLeTwVS4QhKDWgJkmmp9LnGPdodxeQmn=kqVg@mail.gmail.com>
<CAApHDvoWuMGh4z+=vY_gh-Oyh_9ZXWJnhO2JY+f+h4s9PL88Jg@mail.gmail.com>
Thanks! This is very helpful.
On Wed, Oct 29, 2025 at 8:01 PM David Rowley <[email protected]> wrote:
> On Thu, 30 Oct 2025 at 07:29, Jacob Jackson <[email protected]>
> wrote:
> > Hello. I was looking at some query plans recently and noticed something
> that didn't make sense. I have a query that joins a table of questions with
> results for each question (using a table with a composite primary key of
> question id and a user id), filtered by user id. The question IDs and the
> combined question-userIds are guaranteed unique due to being primary keys,
> and yet Postgres still memoizes the inner loop results. Any ideas why?
>
> I agree that when the outer side of the join has unique values that
> Memoize does not make any sense. The planner currenrly puts quite a
> bit of faith in the row estimates for this and if you're getting this
> plan, then the estimates came back indicating there'd be fewer unique
> values of "QuestionUserStatus".question in the input than there are
> input rows to the Memoize node. If you delve into
> cost_memoize_rescan(), you'll see the code for this (look for where
> "hit_ratio" is calculated).
>
> There are also a few prechecks in get_memoize_path() to try to avoid
> this sort of thing, but unfortunately, the information to avoid
> Memoize when the outer side of the join is unique isn't available. We
> do have an "inner_unique" in JoinPathExtraData, but what we'd need for
> this and don't have is "outer_unique". If we had that, we could just
> exit early in get_memoize_path() if that's set to true. Whether or not
> going to the trouble of calculating "outer_unique" is worth the
> trouble, I'm not sure. There was some work on UniqueKeys a few years
> ago, which could have helped in this scenario as we could have more
> easily identified uniqueness at different join levels. That's no
> longer being worked on, as I understand it.
>
> On the other hand, it may be better to somehow enhance
> estimate_num_groups() so it can be given more details about the
> context of the request, i.e the set of Relids that are joined already
> for the input_rows. That way the code could do more analysis into the
> RelOptInfo base quals for the relevant relations. Extended statistics
> for n_distinct could also be applied in some cases too by looking for
> baserestrictinfo with equality quals or EquivalenceClasses with
> ec_has_const = true and a member for other Vars/Exprs in the extended
> statistics.
>
> Unfortunately, neither of these is a trivial fix.
>
> David
>
view thread (2+ messages)
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique?
In-Reply-To: <CAAiQw3w-uEZkz13Y2iU76CucRXBoR=kw6wOfq6gocT63dZLfYg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox