From: Andrei Lepikhov <[email protected]>
To: Tom Lane <[email protected]>
Cc: [email protected]
Cc: Peter Eisentraut <[email protected]>
Subject: Re: Hashed SAOP on composite type with non-hashable column errors at runtime
Date: Sun, 7 Jun 2026 19:56:32 +0200
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>

On 05/06/2026 20:12, Tom Lane wrote:
> So I'm unexcited about putting the fix for this into
> convert_saop_to_hashed_saop_walker as you've done here.
> I think it needs to be addressed at the level of the relevant
> lsyscache.c lookup functions, so that there's some chance that
> future code additions will get this right.  Draft fix attached.

Thanks for your efforts!
Now hash_ok_operator and op_hashjoinable handle all four container-type
equality operators. The remaining side route is a C extension that creates a
custom type grouping other types, whose equality operator is marked HASHES. I
started this research because I had trouble redesigning my ‘statistics’ type
[1], but in that case HASHES just does not seem to work for my custom type.

The lookup_type_cache fixes related to the multirange type also look correct
to me, as do the pg_operator.dat changes.

> 
> I can't get excited about the test case you suggest;
> it's rather expensive and it will do nothing whatever
> to guard against future mistakes of the same kind.

Ok, let me think about that a little more.

> 
> I'm also unexcited about your 0002 and 0003.  

I understand about 0003, but what is the problem with 0002? In practice, people
use massive arrays (I’ve seen thousands of elements). You might remember my
complaint a couple of years ago about the planner’s memory consumption during
array selectivity estimation; at that time you proposed a local planning memory
context. So it would be nice to see (as with SubPlans) whether the SAOP was
left unhashed for a reason.

> I don't really care about optimizing the anonymous-record case; by and large,
> it's coincidental that complicated operations work at all on
> anonymous record types.

Got it. My actual concern here is to give extension developers a way (if
possible) to fix this problem in ORM systems, where they can't change the
complex application but do have a known access pattern and will see
regressions, just as they struggle with regressions each time a brand-new
query tree rewriting rule is introduced ;).

A note on the ‘lefthashfunc == righthashfunc’ condition: it is correct, because
RECORDs can be compared only when the types in corresponding positions on the
left and right sides of the comparison operator are identical:

if (att1->atttypid != att2->atttypid)
    ereport(ERROR, "cannot compare dissimilar column types %s and %s ...");

So, if the typecache is someday extended to compare, let’s say, (int4, int8)
and (int4, numeric), this code should also be revised, right?
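
To make the argument concrete, here is a toy model in plain C. It is only a
sketch and not PostgreSQL source: the helper first_dissimilar_column and the
TOY_* macros are hypothetical names invented for this illustration, and a
record is reduced to a bare array of type OIDs. It mirrors the column-by-column
type check quoted above. When it finds no mismatch, both sides have the same
column layout, which is the situation in which the same hash function can be
expected on the left and on the right.

```c
#include <assert.h>

/* Toy stand-ins for PostgreSQL type OIDs (hypothetical macro names). */
#define TOY_INT8OID     20u
#define TOY_INT4OID     23u
#define TOY_NUMERICOID  1700u

/*
 * Return the zero-based index of the first column whose type differs
 * between the two records, or -1 if every column type matches.
 * Hypothetical helper for illustration, not a PostgreSQL function.
 */
int
first_dissimilar_column(const unsigned int *left_types,
                        const unsigned int *right_types,
                        int ncols)
{
    assert(ncols >= 0);

    for (int i = 0; i < ncols; i++)
    {
        /* This is where the real record comparison would raise an error. */
        if (left_types[i] != right_types[i])
            return i;
    }

    /* Identical layouts on both sides of the operator. */
    return -1;
}
```

For (int4, int8) against (int4, numeric) the helper returns 1, that is, the
second column. If such pairs ever became comparable, the assumption that both
sides share one hash function would no longer follow from this check alone.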


[1] https://github.com/danolivo/pg_track_optimizer/blob/main/rstats.h

-- 
regards, Andrei Lepikhov,
pgEdge