public inbox for [email protected]
help / color / mirror / Atom feedFrom: David Rowley <[email protected]>
To: Tender Wang <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Fix "could not find memoization table entry"
Date: Wed, 25 Mar 2026 13:31:30 +1300
Message-ID: <CAApHDvrRR1VOk4i4CpNWeL48veFshfRAvDTuWxsiUhUqo0akwQ@mail.gmail.com> (raw)
In-Reply-To: <CAApHDvrSSeLNWzQWWcLfZKWwHd=XocVs3abmTmvaSD_sx-dkNQ@mail.gmail.com>
References: <CAHewXNmcXVFdB9_WwA8Ez0P+m_TQy_KzYk5Ri5dvg+fuwjD_yw@mail.gmail.com>
<CAApHDvrSSeLNWzQWWcLfZKWwHd=XocVs3abmTmvaSD_sx-dkNQ@mail.gmail.com>
On Tue, 24 Mar 2026 at 13:12, David Rowley <[email protected]> wrote:
> It looks to me like the bug is in hash_numeric(). Seems like it has no
> idea what type it's meant to return. hash_numeric_extended() doesn't
> seem to be much better.
I recreated this locally and spent some time debugging to understand
the issue more clearly. Because hash_numeric() isn't being careful to
convert the Datum return value into int32, in cases where the Datum is
a number above 2^31, (which would cause an int32 to wrap to negative),
that isn't happening when the expression evaluation code is run and
the Memoize hash table is populated. During the eviction code, when
we re-lookup the hash table with the stored key later, we deform the
MinimalTuple for the Memoize key, because the Datum has been formed
into a MinimalTuple, it's been forced into the correct 32-bit
representation via store_att_byval()'s DatumGetInt32() call. The
values below are examples of what we're dealing with:
1) 3041168208 (0xB5448B50)
2) 18446744072455752528 (0xFFFFFFFFB5448B50)
#1 is the number that the expression evaluation code gets from
hash_numeric(). This is the value that's used to choose the hashtable
bucket. #2 is what we probe the hash table for after reading the
Memoize key from the MinimalTuple, which has wrapped to negative
correctly from being stored in the MinimalTuple via store_att_byval().
If we fix hash_numeric() so it casts its value to int32, it will
return #2, the sign-extended representation.
Obviously, we don't want to back-patch anything that would cause a
user-visible change in the return value of hash_numeric(), so I've
been experimenting to see if there's any way to get PostgreSQL to
output any value from hash_numeric() larger than 2^31 and I've been
unable to. I tried:
Experiment a:
create table bigint (a bigint);
insert into bigint select hash_numeric(n) from
(values('1234.124'::numeric),('1234.124'::numeric)) n(n);
In this case, the hash_numeric() value goes through int48() to cast it
to the bigint before storing it in the table. int48() does
PG_GETARG_INT32(0), which fixes the issue.
Experiment b:
I tried just getting PostgreSQL to output the data:
select hash_numeric(n) from
(values('1234.124'::numeric),('1234.124'::numeric)) n(n);
The larger than 2^31 makes it all the way to int4out() in this case,
but it gets fixed by int4out's PG_GETARG_INT32(0) call.
My 2 experiments aren't exactly exhaustive, so they've only slightly
reduced my concern level.
Does anyone else have any opinions on this one?
I'm also wondering about other places where we're using the wrong
return macro in functions. Maybe there's something we can do in debug
builds and put the return type in the fcinfo and add some Asserts to
the PG_RETURN_* macros.
David
view thread (9+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Fix "could not find memoization table entry"
In-Reply-To: <CAApHDvrRR1VOk4i4CpNWeL48veFshfRAvDTuWxsiUhUqo0akwQ@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox