On Tue, Apr 14, 2026 at 7:03 AM SCHOEMANS Maxime <maxime.schoemans@ulb.be> wrote:
Hi Haibo,

Thank you for picking this up again. I agree with the changes you made
in v5, in particular scoping the patch to the three strict operators and
reworking the tests to check plan structure rather than exact row counts.

Attached is v6 as a 3-patch series building on your v5.

Patch 1 is your range join selectivity patch with one small change: the
range_cmp_bounds result in the merge walk is stored in a local cmp
variable to avoid calling it twice per iteration, as jian he suggested.

Hi Maxime,

Thank you for working on this and for building on top of v5. 

I think patch 1 looks good to me. I do not see major issues there. One small note: the localized bool join_is_reversed; in my version was intentional. I left it that way because get_join_variables() wants a storage location, and I preferred to keep that use local and explicit rather than trying to reshape things around it.


Patch 2 adds the same estimation for multirange types, covering all type
combinations (multirange x multirange, multirange x range, range x
multirange). Since both range and multirange types use the same bound
histogram format and the same RangeBound representation, the core
algorithm is identical.

For patch 2, I am less convinced, especially for &&.

My concern is not so much whether the code runs, but whether the semantic argument is strong enough for the mixed range/multirange cases. The patch assumes that because range and multirange use the same bounds histogram format and the same RangeBound representation, the same estimator can be applied directly. I think that argument is much easier to make for << and >> than for &&.

For single ranges, && works nicely with the usual decomposition because if two single ranges do not overlap, then one must be entirely to the left or entirely to the right of the other. But for multiranges there is a third possibility: neither side is entirely left nor entirely right, and yet they still do not overlap because of an internal gap.

For example: 

A = {[1,2), [100,101)}

B = [50,60)

Here:

A << B is false
A >> B is false
A && B is also false

So for multiranges, “not left and not right” does not imply overlap in the same way it does for single ranges. That makes me worry that reusing the same estimator logic for &&, especially in mixed range/multirange cases, may overestimate overlap because the overall lower/upper bounds do not capture the internal holes.

So I think patch 2 still needs a stronger justification there, and probably more targeted tests around sparse multiranges / hole cases, especially for &&.


Patch 3 removes the duplication between rangetypes_selfuncs.c and
multirangetypes_selfuncs.c that Tom raised as a concern. It makes the
10 shared helper functions non-static, exports them via selfuncs.h,
and deletes the copies from the multirange file. This covers all the
pre-existing duplication between the two files, not just the functions
added in this patch set.

Regards,
Maxime

For patch 3, I agree with the motivation: the duplication between rangetypes_selfuncs.c and multirangetypes_selfuncs.c is not ideal. But I am not convinced that exporting those helpers via selfuncs.h is the right boundary.

My preference would be something tighter:

In other words, I agree that the duplication should be removed, but I think a backend-private internal header should be enough for that goal. I do not think we need to expand visibility more than necessary by moving these helpers out of the file-private space into a broader interface. 

Thanks again for working on this.

Regards,

Haibo