public inbox for [email protected]
help / color / mirror / Atom feedFrom: Jacob Jackson <[email protected]>
To: Ron Johnson <[email protected]>
Cc: pgsql-generallists.postgresql.org <[email protected]>
Subject: Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique?
Date: Wed, 29 Oct 2025 20:06:15 -0400
Message-ID: <CAAiQw3yzw59FG19TKB=Fx5VekWY2QzYLWHHBXqiagfQU6MEB8A@mail.gmail.com> (raw)
In-Reply-To: <CANzqJaDrojumO16bBSWVDg65i5pQM0mdsojfrPO2UyP3t3mUkg@mail.gmail.com>
References: <CAAiQw3yBPrCw6ZLeTwVS4QhKDWgJkmmp9LnGPdodxeQmn=kqVg@mail.gmail.com>
<CANzqJaC3H9GaqQTKUS75_TTJGYjuGyT+jNf+VLG7a9vy_GwhRw@mail.gmail.com>
<CAAiQw3ysEm38FA7ZuMKRnui1X17sscjRWqEV8Gx4tz3zPRnxiw@mail.gmail.com>
<CANzqJaDrojumO16bBSWVDg65i5pQM0mdsojfrPO2UyP3t3mUkg@mail.gmail.com>
The performance/cost difference persists with identical keys. Why would the
planner for the memorized query use different statistics/calculations in
the first place?
On Wed, Oct 29, 2025 at 5:48 PM Ron Johnson <[email protected]> wrote:
> You're running slightly different queries:
> enable_memoize=on: Index Cond: ("user" = '0'::bigint)
> enable_memoize=off: Index Cond: ("user" = '3477145805513'::bigint)
>
> All buffer counts look to be the same on both, and 514 is just 1.5%
> smaller than 522. That looks like statistical noise to me.
>
>
> On Wed, Oct 29, 2025 at 5:34 PM Jacob Jackson <[email protected]>
> wrote:
>
>> I was curious to see whether there was any reason I wasn't seeing for
>> Postgres to decide the memoized version was lower cost and try to memoize
>> these operations.
>>
>> On Wed, Oct 29, 2025 at 3:20 PM Ron Johnson <[email protected]>
>> wrote:
>>
>>> What's the actual problem? Does enable_memoize=on return incorrect
>>> results?
>>>
>>> Because a 45 microsecond (yes, 45 microseconds: 0.138 milliseconds = 138
>>> microseconds; same for the others) slowdown isn't something I'd get too
>>> worked up about.
>>>
>>> On Wed, Oct 29, 2025 at 2:29 PM Jacob Jackson <[email protected]>
>>> wrote:
>>>
>>>> Hello. I was looking at some query plans recently and noticed something
>>>> that didn't make sense. I have a query that joins a table of questions with
>>>> results for each question (using a table with a composite primary key of
>>>> question id and a user id), filtered by user id. The question IDs and the
>>>> combined question-userIds are guaranteed unique due to being primary keys,
>>>> and yet Postgres still memoizes the inner loop results. Any ideas why? Is
>>>> this just a failure of the query planner (I would be happy to explore
>>>> creating a PR), did I not properly guarantee uniqueness, or is there
>>>> another reason for memoization? The memoized version is consistently
>>>> slightly slower in my testing, despite the calculated cost being lower.
>>>> Here are the query plans:
>>>>
>>>> enable_memoize=on:
>>>>
>>>> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM questions LEFT JOIN
>>>> "QuestionUserStatus" ON questions.id = "QuestionUserStatus".question
>>>> WHERE "QuestionUserStatus".user = 0;
>>>>
>>>> Nested Loop (cost=0.71..514.09 rows=277 width=1381) (actual
>>>> time=0.021..0.520 rows=231 loops=1)
>>>> ├ Buffers: shared hit=859
>>>> ├ Index Scan using "QuestionUserStatus_user_question_pk" on
>>>> "QuestionUserStatus" (cost=0.42..178.88 rows=277 width=18) (actual
>>>> time=0.014..0.114 rows=231 loops=1)
>>>> │ ├ Index Cond: ("user" = '0'::bigint)
>>>> │ └ Buffers: shared hit=166
>>>> └ Memoize (cost=0.29..1.25 rows=1 width=1363) (actual
>>>> time=0.001..0.001 rows=1 loops=231)
>>>> ├ Cache Key: "QuestionUserStatus".question
>>>> ├ Cache Mode: logical
>>>> ├ Hits: 0 Misses: 231 Evictions: 0 Overflows: 0 Memory Usage:
>>>> 320kB
>>>> ├ Buffers: shared hit=693
>>>> └ Index Scan using questions_pkey on questions (cost=0.28..1.24
>>>> rows=1 width=1363) (actual time=0.001..0.001 rows=1 loops=231)
>>>> ├ Index Cond: (id = "QuestionUserStatus".question)
>>>> └ Buffers: shared hit=693
>>>> Planning:
>>>> └ Buffers: shared hit=6
>>>> Planning Time: 0.183 ms
>>>> Execution Time: 0.548 ms
>>>>
>>>> enable_memoize=off:
>>>>
>>>> Nested Loop (cost=0.70..521.98 rows=277 width=1381) (actual
>>>> time=0.018..0.421 rows=231 loops=1)
>>>> ├ Buffers: shared hit=859
>>>> ├ Index Scan using "QuestionUserStatus_user_question_pk" on
>>>> "QuestionUserStatus" (cost=0.42..178.88 rows=277 width=18) (actual
>>>> time=0.014..0.099 rows=231 loops=1)
>>>> │ ├ Index Cond: ("user" = '3477145805513'::bigint)
>>>> │ └ Buffers: shared hit=166
>>>> └ Index Scan using questions_pkey on questions (cost=0.28..1.24 rows=1
>>>> width=1363) (actual time=0.001..0.001 rows=1 loops=231)
>>>> ├ Index Cond: (id = "QuestionUserStatus".question)
>>>> └ Buffers: shared hit=693
>>>> Planning:
>>>> └ Buffers: shared hit=6
>>>> Planning Time: 0.197 ms
>>>> Execution Time: 0.444 ms
>>>>
>>>> Thanks for the help,
>>>> Jacob
>>>>
>>>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
>
view thread (5+ messages)
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Postgres memoizing inner loop of join when outer loop join key is guaranteed unique?
In-Reply-To: <CAAiQw3yzw59FG19TKB=Fx5VekWY2QzYLWHHBXqiagfQU6MEB8A@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox