public inbox for [email protected]  
help / color / mirror / Atom feed
From: David Rowley <[email protected]>
To: Matthias van de Meent <[email protected]>
Cc: jian he <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: SQL-level pg_datum_image_equal
Date: Thu, 26 Mar 2026 09:40:08 +1300
Message-ID: <CAApHDvpC8sj-ayXMQsh5DfnZV+X2QtUdvQ3Qr=81bcyaW7gVFw@mail.gmail.com> (raw)
In-Reply-To: <CAEze2WjUyYhcUwzaPiQbe-xBB_knbnG8Xr_9ACL7HrGVx=Vydw@mail.gmail.com>
References: <CAEze2WhsqYjg0oGY+7yooimUK7zRc9PY9u8u-Oo=VmJ+DAAkKg@mail.gmail.com>
	<CACJufxFV5KqmF86upFmX2tPzfKMWjWNf8+uYiYufLKatOsYXQw@mail.gmail.com>
	<CAEze2Wg4Fj+9LQOv=J5cqwapD8vz3oDh1B0iyX7RwesvseH9gg@mail.gmail.com>
	<CAEze2WjUyYhcUwzaPiQbe-xBB_knbnG8Xr_9ACL7HrGVx=Vydw@mail.gmail.com>

On Thu, 26 Mar 2026 at 03:46, Matthias van de Meent
<[email protected]> wrote:
> Attached is v2, which adds the new function to the docs, in addition
> to rebasing the patch onto master.

In [1] I was looking into reported bugs with the datum_image code not
behaving correctly. It turns out this is due to hash_numeric() using
the wrong PG_RETURN_* macro, resulting in it not correctly
sign-extending negative 32-bit ints. I did some analysis to see if
there were any user-visible changes from fixing that in the back
branches. I didn't find any.  What you're adding here would add one. I
think we might need to become more strict about using the correct
return macro before we expose this stuff, else it's going to be a
rather scary process to fix bugs when it could affect the
datum_image_equal result.

The thing about Memoize slowly finding all the bugs in the
datum_image_* code is that we're free to fix these bugs, as the use
case (the Memoize hash table) only lasts as long as the query.  If we
expose all of this to users, then they could persist things in tables.
6911f8037 fixed another problem that resulted in the datum_image code
not doing the right thing.

I also proposed a hack in [2] to fix the sign-extension problem. I had
hoped that might be temporary, but if we were to expose this function
at the SQL level, then I doubt we'd be brave enough to remove it.

Consider the results of the following 2 queries, which only differ in
that one uses a materialized CTE and the other does not.

with cte(hash) as materialized (select hash_numeric(n) from
(values('1234.124'::numeric),('1234.124'::numeric)) n(n))
select * from cte where pg_datum_image_equal(hash,
hash_numeric('1234.124'::numeric)); -- wrong results

with cte(hash) as (select hash_numeric(n) from
(values('1234.124'::numeric),('1234.124'::numeric)) n(n))
select * from cte where pg_datum_image_equal(hash,
hash_numeric('1234.124'::numeric));

 hash
------
(0 rows)

    hash
------------
 -612512148
 -612512148
(2 rows)

This would work correctly if I pushed the patch in [2]. But as of yet,
I'm not sure about it. I at least think we should delay doing this
until we've come up with a fix that we're confident in.

David

[1] https://postgr.es/m/CAApHDvrRR1VOk4i4CpNWeL48veFshfRAvDTuWxsiUhUqo0akwQ%40mail.gmail.com
[2] https://postgr.es/m/CAApHDvreF-UiqBaHtRTQWQ6z1X9snstJW%2Bdfb2DU5GOb-uPEBg%40mail.gmail.com





view thread (11+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: SQL-level pg_datum_image_equal
  In-Reply-To: <CAApHDvpC8sj-ayXMQsh5DfnZV+X2QtUdvQ3Qr=81bcyaW7gVFw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox