public inbox for [email protected]  
help / color / mirror / Atom feed
From: Alexander Nestorov <[email protected]>
To: Andrey Borodin <[email protected]>
Cc: pgsql-hackers mailing list <[email protected]>
Subject: Re: [PATCH] btree_gist: add cross-type integer operator support for GiST
Date: Sun, 14 Jun 2026 01:35:14 +0200
Message-ID: <80ef3b41-1a71-47a4-a320-29e118d7092c@Spark> (raw)
In-Reply-To: <18e88767-6c31-402a-887e-37c38b366a6a@Spark>
References: <aac10ffa-a0ca-4c49-846b-3655cbc6b37e@Spark>
	<36b4f67d-5975-452c-a6b8-b6407f0924ee@Spark>
	<[email protected]>
	<18e88767-6c31-402a-887e-37c38b366a6a@Spark>

Hello Andrey!

Following up with the things I owed you: the benchmarks, the consistency check
and adding a note for the 2^53 case.

I added a fast path. Each integer opclass's consistent() / distance() now
detects the "same type" case and calls the original gbt_num_consistent() /
gbt_num_distance() directly.

To confirm there's no regression I ran a microbenchmark on an -O2 build, no
asserts, single client, over a 500k row int4 GiST index, with the following
options:

-c enable_seqscan=off \
-c enable_bitmapscan=off \
-c enable_sort=off \
-c max_parallel_workers_per_gather=0

This is the base for the bench:

CREATE EXTENSION IF NOT EXISTS btree_gist;
DROP TABLE IF EXISTS benchg;
CREATE TABLE benchg (a int4);
INSERT INTO benchg SELECT g FROM generate_series(0, 499999) g;
CREATE INDEX benchg_idx ON benchg USING gist (a);
VACUUM (ANALYZE, FREEZE) benchg;

And the two workloads:

consistent(), full-range index-only count(*):
SELECT count(*) FROM benchg WHERE a >= 0 AND a <= 499999;

distance(), full KNN ordering (ORDER BY a<->k over all rows):
SELECT count(*) FROM (SELECT a FROM benchg ORDER BY a <-> 250000 LIMIT 1000000) q;

The numbers in ms (12 repetitions, 15s each) before
(3e3d7875e95621b02311ea3443e5139e3bce944a) and after my patch:

  before   consistent   min/med/mean = 51.754 52.718 54.137 ms
  after    consistent   min/med/mean = 52.042 52.480 52.572 ms
  ------------------------------------------------------------------------
  before   distance     min/med/mean = 76.863 77.177 77.395 ms
  after    distance     min/med/mean = 77.357 77.803 77.980 ms

All numbers seem to be within measurement noise, except the consistent-before,
which is probably inflated by one slow rep.

Regarding the other point, I explored the regression suite path I mentioned.

The consistent() / distance() functions dispatch cross-type queries through a
single static table of supported subtype OIDs (gbt_int_crosstype_table in
btree_utils_num.c). I expose that exact table to SQL, in gbt_int_crosstype_subtypes(),
so there is no hand-maintained second copy of the list.

The int_crosstype.sql regression test then builds the set of cross-type
(lefttype, righttype, strategy) entries that should exist in pg_amop from that
function, and EXCEPTs it against the cross-type rows actually present in
gist_int{2,4,8}_ops:

  - a pg_amop row whose subtype the C dispatch does not handle shows up as
    "unexpected in pg_amop", and
  - a dispatch entry without the matching pg_amop rows shows up as
    "missing from pg_amop".

Either kind of drift produces a diff under `make check`. So adding an ALTER
OPERATOR FAMILY entry without a matching dispatch entry (or vice versa) fails
the suite (as I mentioned in my previous email, I'm not aware of a way to do
this with amvalidate() without patching core).

I'm attaching the new set of patches (this time I include the tests).

Best regards!


Attachments:

  [application/octet-stream] 0001-Implement-cross-type-operators-for-GiST-indexes.patch (26.4K, 3-0001-Implement-cross-type-operators-for-GiST-indexes.patch)
  download | inline diff:
From 363b9c22874cdbb6cf780899c784c813bb10f94e Mon Sep 17 00:00:00 2001
From: Alexander Nestorov <[email protected]>
Date: Thu, 4 Jun 2026 00:06:54 +0200
Subject: [PATCH] Implement cross-type operators for GiST indexes

---
 contrib/btree_gist/Makefile                  |   4 +-
 contrib/btree_gist/btree_gist--1.9--1.10.sql | 143 +++++++++++++++++
 contrib/btree_gist/btree_gist.control        |   2 +-
 contrib/btree_gist/btree_int2.c              |  70 ++++++--
 contrib/btree_gist/btree_int4.c              |  70 ++++++--
 contrib/btree_gist/btree_int8.c              |  70 ++++++--
 contrib/btree_gist/btree_utils_num.c         | 160 +++++++++++++++++++
 contrib/btree_gist/btree_utils_num.h         |  16 ++
 contrib/btree_gist/meson.build               |   2 +
 9 files changed, 501 insertions(+), 36 deletions(-)
 create mode 100644 contrib/btree_gist/btree_gist--1.9--1.10.sql

diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
index fbbbca95598..1d0668d97ab 100644
--- a/contrib/btree_gist/Makefile
+++ b/contrib/btree_gist/Makefile
@@ -32,19 +32,19 @@ OBJS =  \
 EXTENSION = btree_gist
 DATA = btree_gist--1.0--1.1.sql \
        btree_gist--1.1--1.2.sql btree_gist--1.2--1.3.sql \
        btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \
        btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \
        btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql \
-       btree_gist--1.9.sql
+       btree_gist--1.9.sql btree_gist--1.9--1.10.sql
 PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
 
 REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \
         time timetz date interval macaddr macaddr8 inet cidr text varchar char \
         bytea bit varbit numeric uuid not_equal enum bool partitions \
-        stratnum without_overlaps
+        stratnum int_crosstype without_overlaps
 
 SHLIB_LINK += $(filter -lm, $(LIBS))
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
 PGXS := $(shell $(PG_CONFIG) --pgxs)
diff --git a/contrib/btree_gist/btree_gist--1.9--1.10.sql b/contrib/btree_gist/btree_gist--1.9--1.10.sql
new file mode 100644
index 00000000000..c9ff1955204
--- /dev/null
+++ b/contrib/btree_gist/btree_gist--1.9--1.10.sql
@@ -0,0 +1,143 @@
+/* contrib/btree_gist/btree_gist--1.9--1.10.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.10'" to load this file. \quit
+
+-- Add cross-type operator support for the integer trio (int2, int4, int8)
+-- to the existing GiST operator families.
+--
+-- GiST's amvalidate requires support functions in a family to have matching
+-- left/right input types, so the catalog additions below are deliberately
+-- pg_amop-only. The existing consistent/distance support functions dispatch
+-- on the subtype OID: same-type queries take the normal path, while mixed-width
+-- integer queries are promoted to int64 and compared by the integer cross-type
+-- helpers in btree_utils_num.c.
+
+CREATE FUNCTION int2_int4_dist(int2, int4)
+RETURNS int4
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int4_int2_dist(int4, int2)
+RETURNS int4
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int2_int8_dist(int2, int8)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int8_int2_dist(int8, int2)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int4_int8_dist(int4, int8)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int8_int4_dist(int8, int4)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+-- Introspection helper exposing the integer query subtypes understood by the C
+-- cross-type dispatch (gbt_int_crosstype_table in btree_utils_num.c). The
+-- regression tests use it to assert that the cross-type pg_amop entries below
+-- never drift from what the dispatch can handle.
+CREATE FUNCTION gbt_int_crosstype_subtypes()
+RETURNS SETOF oid
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE PARALLEL SAFE;
+
+CREATE OPERATOR <-> (
+	LEFTARG = int2,
+	RIGHTARG = int4,
+	PROCEDURE = int2_int4_dist,
+	COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+	LEFTARG = int4,
+	RIGHTARG = int2,
+	PROCEDURE = int4_int2_dist,
+	COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+	LEFTARG = int2,
+	RIGHTARG = int8,
+	PROCEDURE = int2_int8_dist,
+	COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+	LEFTARG = int8,
+	RIGHTARG = int2,
+	PROCEDURE = int8_int2_dist,
+	COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+	LEFTARG = int4,
+	RIGHTARG = int8,
+	PROCEDURE = int4_int8_dist,
+	COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+	LEFTARG = int8,
+	RIGHTARG = int4,
+	PROCEDURE = int8_int4_dist,
+	COMMUTATOR = '<->'
+);
+
+ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
+	OPERATOR	1	<  (int2, int4),
+	OPERATOR	2	<= (int2, int4),
+	OPERATOR	3	=  (int2, int4),
+	OPERATOR	4	>= (int2, int4),
+	OPERATOR	5	>  (int2, int4),
+	OPERATOR	6	<> (int2, int4),
+	OPERATOR	15	<-> (int2, int4) FOR ORDER BY pg_catalog.integer_ops,
+	OPERATOR	1	<  (int2, int8),
+	OPERATOR	2	<= (int2, int8),
+	OPERATOR	3	=  (int2, int8),
+	OPERATOR	4	>= (int2, int8),
+	OPERATOR	5	>  (int2, int8),
+	OPERATOR	6	<> (int2, int8),
+	OPERATOR	15	<-> (int2, int8) FOR ORDER BY pg_catalog.integer_ops;
+
+ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
+	OPERATOR	1	<  (int4, int2),
+	OPERATOR	2	<= (int4, int2),
+	OPERATOR	3	=  (int4, int2),
+	OPERATOR	4	>= (int4, int2),
+	OPERATOR	5	>  (int4, int2),
+	OPERATOR	6	<> (int4, int2),
+	OPERATOR	15	<-> (int4, int2) FOR ORDER BY pg_catalog.integer_ops,
+	OPERATOR	1	<  (int4, int8),
+	OPERATOR	2	<= (int4, int8),
+	OPERATOR	3	=  (int4, int8),
+	OPERATOR	4	>= (int4, int8),
+	OPERATOR	5	>  (int4, int8),
+	OPERATOR	6	<> (int4, int8),
+	OPERATOR	15	<-> (int4, int8) FOR ORDER BY pg_catalog.integer_ops;
+
+ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
+	OPERATOR	1	<  (int8, int2),
+	OPERATOR	2	<= (int8, int2),
+	OPERATOR	3	=  (int8, int2),
+	OPERATOR	4	>= (int8, int2),
+	OPERATOR	5	>  (int8, int2),
+	OPERATOR	6	<> (int8, int2),
+	OPERATOR	15	<-> (int8, int2) FOR ORDER BY pg_catalog.integer_ops,
+	OPERATOR	1	<  (int8, int4),
+	OPERATOR	2	<= (int8, int4),
+	OPERATOR	3	=  (int8, int4),
+	OPERATOR	4	>= (int8, int4),
+	OPERATOR	5	>  (int8, int4),
+	OPERATOR	6	<> (int8, int4),
+	OPERATOR	15	<-> (int8, int4) FOR ORDER BY pg_catalog.integer_ops;
diff --git a/contrib/btree_gist/btree_gist.control b/contrib/btree_gist/btree_gist.control
index 69d9341a0ad..e606fa6551d 100644
--- a/contrib/btree_gist/btree_gist.control
+++ b/contrib/btree_gist/btree_gist.control
@@ -1,6 +1,6 @@
 # btree_gist extension
 comment = 'support for indexing common datatypes in GiST'
-default_version = '1.9'
+default_version = '1.10'
 module_pathname = '$libdir/btree_gist'
 relocatable = true
 trusted = true
diff --git a/contrib/btree_gist/btree_int2.c b/contrib/btree_gist/btree_int2.c
index cc4b33177e3..d00b24d9a57 100644
--- a/contrib/btree_gist/btree_int2.c
+++ b/contrib/btree_gist/btree_int2.c
@@ -2,12 +2,13 @@
  * contrib/btree_gist/btree_int2.c
  */
 #include "postgres.h"
 
 #include "btree_gist.h"
 #include "btree_utils_num.h"
+#include "catalog/pg_type.h"
 #include "common/int.h"
 #include "utils/rel.h"
 #include "utils/sortsupport.h"
 
 typedef struct int16key
 {
@@ -73,13 +74,12 @@ gbt_int2key_cmp(const void *a, const void *b, FmgrInfo *flinfo)
 static float8
 gbt_int2_dist(const void *a, const void *b, FmgrInfo *flinfo)
 {
 	return GET_FLOAT_DISTANCE(int16, a, b);
 }
 
-
 static const gbtree_ninfo tinfo =
 {
 	gbt_t_int2,
 	sizeof(int16),
 	4,							/* sizeof(gbtreekey4) */
 	gbt_int2gt,
@@ -109,12 +109,46 @@ int2_dist(PG_FUNCTION_ARGS)
 
 	ra = abs(r);
 
 	PG_RETURN_INT16(ra);
 }
 
+PG_FUNCTION_INFO_V1(int2_int4_dist);
+Datum
+int2_int4_dist(PG_FUNCTION_ARGS)
+{
+	int32		a = (int32) PG_GETARG_INT16(0);
+	int32		b = PG_GETARG_INT32(1);
+	int32		r;
+
+	if (pg_sub_s32_overflow(a, b, &r) ||
+		r == PG_INT32_MIN)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("integer out of range")));
+
+	PG_RETURN_INT32(abs(r));
+}
+
+PG_FUNCTION_INFO_V1(int2_int8_dist);
+Datum
+int2_int8_dist(PG_FUNCTION_ARGS)
+{
+	int64		a = (int64) PG_GETARG_INT16(0);
+	int64		b = PG_GETARG_INT64(1);
+	int64		r;
+
+	if (pg_sub_s64_overflow(a, b, &r) ||
+		r == PG_INT64_MIN)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("bigint out of range")));
+
+	PG_RETURN_INT64(i64abs(r));
+}
+
 
 /**************************************************
  * GiST support functions
  **************************************************/
 
 Datum
@@ -134,47 +168,61 @@ gbt_int2_fetch(PG_FUNCTION_ARGS)
 }
 
 Datum
 gbt_int2_consistent(PG_FUNCTION_ARGS)
 {
 	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-	int16		query = PG_GETARG_INT16(1);
+	Datum		queryDatum = PG_GETARG_DATUM(1);
 	StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
-#ifdef NOT_USED
 	Oid			subtype = PG_GETARG_OID(3);
-#endif
 	bool	   *recheck = (bool *) PG_GETARG_POINTER(4);
 	int16KEY   *kkk = (int16KEY *) DatumGetPointer(entry->key);
+	int16		query;
 	GBT_NUMKEY_R key;
 
 	/* All cases served by this function are exact */
 	*recheck = false;
 
 	key.lower = (GBT_NUMKEY *) &kkk->lower;
 	key.upper = (GBT_NUMKEY *) &kkk->upper;
 
-	PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
-									  GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
+	if (likely(subtype == InvalidOid || subtype == INT2OID))
+	{
+		query = DatumGetInt16(queryDatum);
+		PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
+										  GIST_LEAF(entry), &tinfo,
+										  fcinfo->flinfo));
+	}
+
+	PG_RETURN_BOOL(gbt_int_consistent_x((int64) kkk->lower, (int64) kkk->upper,
+										queryDatum, subtype, &strategy,
+										GIST_LEAF(entry)));
 }
 
 Datum
 gbt_int2_distance(PG_FUNCTION_ARGS)
 {
 	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-	int16		query = PG_GETARG_INT16(1);
-#ifdef NOT_USED
+	Datum		queryDatum = PG_GETARG_DATUM(1);
 	Oid			subtype = PG_GETARG_OID(3);
-#endif
 	int16KEY   *kkk = (int16KEY *) DatumGetPointer(entry->key);
+	int16		query;
 	GBT_NUMKEY_R key;
 
 	key.lower = (GBT_NUMKEY *) &kkk->lower;
 	key.upper = (GBT_NUMKEY *) &kkk->upper;
 
-	PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
-									  &tinfo, fcinfo->flinfo));
+	if (likely(subtype == InvalidOid || subtype == INT2OID))
+	{
+		query = DatumGetInt16(queryDatum);
+		PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
+										  &tinfo, fcinfo->flinfo));
+	}
+
+	PG_RETURN_FLOAT8(gbt_int_distance_x((int64) kkk->lower, (int64) kkk->upper,
+										queryDatum, subtype));
 }
 
 Datum
 gbt_int2_union(PG_FUNCTION_ARGS)
 {
 	GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
diff --git a/contrib/btree_gist/btree_int4.c b/contrib/btree_gist/btree_int4.c
index 47790578e6b..4df91f2057c 100644
--- a/contrib/btree_gist/btree_int4.c
+++ b/contrib/btree_gist/btree_int4.c
@@ -1,12 +1,13 @@
 /*
  * contrib/btree_gist/btree_int4.c
  */
 #include "postgres.h"
 #include "btree_gist.h"
 #include "btree_utils_num.h"
+#include "catalog/pg_type.h"
 #include "common/int.h"
 #include "utils/rel.h"
 #include "utils/sortsupport.h"
 
 typedef struct int32key
 {
@@ -71,13 +72,12 @@ gbt_int4key_cmp(const void *a, const void *b, FmgrInfo *flinfo)
 static float8
 gbt_int4_dist(const void *a, const void *b, FmgrInfo *flinfo)
 {
 	return GET_FLOAT_DISTANCE(int32, a, b);
 }
 
-
 static const gbtree_ninfo tinfo =
 {
 	gbt_t_int4,
 	sizeof(int32),
 	8,							/* sizeof(gbtreekey8) */
 	gbt_int4gt,
@@ -107,12 +107,46 @@ int4_dist(PG_FUNCTION_ARGS)
 
 	ra = abs(r);
 
 	PG_RETURN_INT32(ra);
 }
 
+PG_FUNCTION_INFO_V1(int4_int2_dist);
+Datum
+int4_int2_dist(PG_FUNCTION_ARGS)
+{
+	int32		a = PG_GETARG_INT32(0);
+	int32		b = (int32) PG_GETARG_INT16(1);
+	int32		r;
+
+	if (pg_sub_s32_overflow(a, b, &r) ||
+		r == PG_INT32_MIN)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("integer out of range")));
+
+	PG_RETURN_INT32(abs(r));
+}
+
+PG_FUNCTION_INFO_V1(int4_int8_dist);
+Datum
+int4_int8_dist(PG_FUNCTION_ARGS)
+{
+	int64		a = (int64) PG_GETARG_INT32(0);
+	int64		b = PG_GETARG_INT64(1);
+	int64		r;
+
+	if (pg_sub_s64_overflow(a, b, &r) ||
+		r == PG_INT64_MIN)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("bigint out of range")));
+
+	PG_RETURN_INT64(i64abs(r));
+}
+
 
 /**************************************************
  * GiST support functions
  **************************************************/
 
 Datum
@@ -132,47 +166,61 @@ gbt_int4_fetch(PG_FUNCTION_ARGS)
 }
 
 Datum
 gbt_int4_consistent(PG_FUNCTION_ARGS)
 {
 	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-	int32		query = PG_GETARG_INT32(1);
+	Datum		queryDatum = PG_GETARG_DATUM(1);
 	StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
-#ifdef NOT_USED
 	Oid			subtype = PG_GETARG_OID(3);
-#endif
 	bool	   *recheck = (bool *) PG_GETARG_POINTER(4);
 	int32KEY   *kkk = (int32KEY *) DatumGetPointer(entry->key);
+	int32		query;
 	GBT_NUMKEY_R key;
 
 	/* All cases served by this function are exact */
 	*recheck = false;
 
 	key.lower = (GBT_NUMKEY *) &kkk->lower;
 	key.upper = (GBT_NUMKEY *) &kkk->upper;
 
-	PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
-									  GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
+	if (likely(subtype == InvalidOid || subtype == INT4OID))
+	{
+		query = DatumGetInt32(queryDatum);
+		PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
+										  GIST_LEAF(entry), &tinfo,
+										  fcinfo->flinfo));
+	}
+
+	PG_RETURN_BOOL(gbt_int_consistent_x((int64) kkk->lower, (int64) kkk->upper,
+										queryDatum, subtype, &strategy,
+										GIST_LEAF(entry)));
 }
 
 Datum
 gbt_int4_distance(PG_FUNCTION_ARGS)
 {
 	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-	int32		query = PG_GETARG_INT32(1);
-#ifdef NOT_USED
+	Datum		queryDatum = PG_GETARG_DATUM(1);
 	Oid			subtype = PG_GETARG_OID(3);
-#endif
 	int32KEY   *kkk = (int32KEY *) DatumGetPointer(entry->key);
+	int32		query;
 	GBT_NUMKEY_R key;
 
 	key.lower = (GBT_NUMKEY *) &kkk->lower;
 	key.upper = (GBT_NUMKEY *) &kkk->upper;
 
-	PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
-									  &tinfo, fcinfo->flinfo));
+	if (likely(subtype == InvalidOid || subtype == INT4OID))
+	{
+		query = DatumGetInt32(queryDatum);
+		PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
+										  &tinfo, fcinfo->flinfo));
+	}
+
+	PG_RETURN_FLOAT8(gbt_int_distance_x((int64) kkk->lower, (int64) kkk->upper,
+										queryDatum, subtype));
 }
 
 Datum
 gbt_int4_union(PG_FUNCTION_ARGS)
 {
 	GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
diff --git a/contrib/btree_gist/btree_int8.c b/contrib/btree_gist/btree_int8.c
index f48122c8d84..ae01273b709 100644
--- a/contrib/btree_gist/btree_int8.c
+++ b/contrib/btree_gist/btree_int8.c
@@ -2,12 +2,13 @@
  * contrib/btree_gist/btree_int8.c
  */
 #include "postgres.h"
 
 #include "btree_gist.h"
 #include "btree_utils_num.h"
+#include "catalog/pg_type.h"
 #include "common/int.h"
 #include "utils/rel.h"
 #include "utils/sortsupport.h"
 
 typedef struct int64key
 {
@@ -73,13 +74,12 @@ gbt_int8key_cmp(const void *a, const void *b, FmgrInfo *flinfo)
 static float8
 gbt_int8_dist(const void *a, const void *b, FmgrInfo *flinfo)
 {
 	return GET_FLOAT_DISTANCE(int64, a, b);
 }
 
-
 static const gbtree_ninfo tinfo =
 {
 	gbt_t_int8,
 	sizeof(int64),
 	16,							/* sizeof(gbtreekey16) */
 	gbt_int8gt,
@@ -109,12 +109,46 @@ int8_dist(PG_FUNCTION_ARGS)
 
 	ra = i64abs(r);
 
 	PG_RETURN_INT64(ra);
 }
 
+PG_FUNCTION_INFO_V1(int8_int2_dist);
+Datum
+int8_int2_dist(PG_FUNCTION_ARGS)
+{
+	int64		a = PG_GETARG_INT64(0);
+	int64		b = (int64) PG_GETARG_INT16(1);
+	int64		r;
+
+	if (pg_sub_s64_overflow(a, b, &r) ||
+		r == PG_INT64_MIN)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("bigint out of range")));
+
+	PG_RETURN_INT64(i64abs(r));
+}
+
+PG_FUNCTION_INFO_V1(int8_int4_dist);
+Datum
+int8_int4_dist(PG_FUNCTION_ARGS)
+{
+	int64		a = PG_GETARG_INT64(0);
+	int64		b = (int64) PG_GETARG_INT32(1);
+	int64		r;
+
+	if (pg_sub_s64_overflow(a, b, &r) ||
+		r == PG_INT64_MIN)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("bigint out of range")));
+
+	PG_RETURN_INT64(i64abs(r));
+}
+
 
 /**************************************************
  * GiST support functions
  **************************************************/
 
 Datum
@@ -134,47 +168,61 @@ gbt_int8_fetch(PG_FUNCTION_ARGS)
 }
 
 Datum
 gbt_int8_consistent(PG_FUNCTION_ARGS)
 {
 	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-	int64		query = PG_GETARG_INT64(1);
+	Datum		queryDatum = PG_GETARG_DATUM(1);
 	StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
-#ifdef NOT_USED
 	Oid			subtype = PG_GETARG_OID(3);
-#endif
 	bool	   *recheck = (bool *) PG_GETARG_POINTER(4);
 	int64KEY   *kkk = (int64KEY *) DatumGetPointer(entry->key);
+	int64		query;
 	GBT_NUMKEY_R key;
 
 	/* All cases served by this function are exact */
 	*recheck = false;
 
 	key.lower = (GBT_NUMKEY *) &kkk->lower;
 	key.upper = (GBT_NUMKEY *) &kkk->upper;
 
-	PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
-									  GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
+	if (likely(subtype == InvalidOid || subtype == INT8OID))
+	{
+		query = DatumGetInt64(queryDatum);
+		PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
+										  GIST_LEAF(entry), &tinfo,
+										  fcinfo->flinfo));
+	}
+
+	PG_RETURN_BOOL(gbt_int_consistent_x(kkk->lower, kkk->upper,
+										queryDatum, subtype, &strategy,
+										GIST_LEAF(entry)));
 }
 
 Datum
 gbt_int8_distance(PG_FUNCTION_ARGS)
 {
 	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-	int64		query = PG_GETARG_INT64(1);
-#ifdef NOT_USED
+	Datum		queryDatum = PG_GETARG_DATUM(1);
 	Oid			subtype = PG_GETARG_OID(3);
-#endif
 	int64KEY   *kkk = (int64KEY *) DatumGetPointer(entry->key);
+	int64		query;
 	GBT_NUMKEY_R key;
 
 	key.lower = (GBT_NUMKEY *) &kkk->lower;
 	key.upper = (GBT_NUMKEY *) &kkk->upper;
 
-	PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
-									  &tinfo, fcinfo->flinfo));
+	if (likely(subtype == InvalidOid || subtype == INT8OID))
+	{
+		query = DatumGetInt64(queryDatum);
+		PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
+										  &tinfo, fcinfo->flinfo));
+	}
+
+	PG_RETURN_FLOAT8(gbt_int_distance_x(kkk->lower, kkk->upper,
+										queryDatum, subtype));
 }
 
 Datum
 gbt_int8_union(PG_FUNCTION_ARGS)
 {
 	GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
diff --git a/contrib/btree_gist/btree_utils_num.c b/contrib/btree_gist/btree_utils_num.c
index 3affe4c2c46..cff47e2d876 100644
--- a/contrib/btree_gist/btree_utils_num.c
+++ b/contrib/btree_gist/btree_utils_num.c
@@ -2,12 +2,14 @@
  * contrib/btree_gist/btree_utils_num.c
  */
 #include "postgres.h"
 
 #include "btree_gist.h"
 #include "btree_utils_num.h"
+#include "catalog/pg_type.h"
+#include "funcapi.h"
 #include "utils/cash.h"
 #include "utils/date.h"
 #include "utils/timestamp.h"
 
 
 GISTENTRY *
@@ -304,12 +306,144 @@ gbt_num_consistent(const GBT_NUMKEY_R *key,
 			retval = false;
 	}
 
 	return retval;
 }
 
+/*
+ * Cross-type dispatch table for the integer opclasses.
+ *
+ * This is the single source of truth for which query subtypes the integer
+ * cross-type path understands. gbt_int_query_to_int64() promotes exactly the
+ * subtypes listed here, and the SQL-visible gbt_int_crosstype_subtypes()
+ * exposes the same list so the regression tests can assert that the cross-type
+ * pg_amop entries in gist_int{2,4,8}_ops never drift from what the C code can
+ * actually handle.
+ */
+typedef int64 (*gbt_int_promote_fn) (Datum query);
+
+static int64
+gbt_int2_to_int64(Datum query)
+{
+	return (int64) DatumGetInt16(query);
+}
+
+static int64
+gbt_int4_to_int64(Datum query)
+{
+	return (int64) DatumGetInt32(query);
+}
+
+static int64
+gbt_int8_to_int64(Datum query)
+{
+	return DatumGetInt64(query);
+}
+
+typedef struct
+{
+	Oid			subtype;
+	gbt_int_promote_fn promote;
+}			gbt_int_crosstype;
+
+static const gbt_int_crosstype gbt_int_crosstype_table[] = {
+	{INT2OID, gbt_int2_to_int64},
+	{INT4OID, gbt_int4_to_int64},
+	{INT8OID, gbt_int8_to_int64},
+};
+
+/*
+ * Promote a cross-type integer query value to int64.
+ *
+ * A subtype outside gbt_int_crosstype_table means the C dispatch is out of sync
+ * with the operator-family registrations in pg_amop, so we treat it as an
+ * internal error.  The regression tests assert this never happens.
+ */
+static int64
+gbt_int_query_to_int64(Datum query, Oid subtype)
+{
+	int			i;
+
+	for (i = 0; i < lengthof(gbt_int_crosstype_table); i++)
+	{
+		if (gbt_int_crosstype_table[i].subtype == subtype)
+			return gbt_int_crosstype_table[i].promote(query);
+	}
+
+	elog(ERROR, "unrecognized subtype %u for btree_gist integer cross-type comparison",
+		 subtype);
+	return 0;					/* keep compiler quiet */
+}
+
+/*
+ * gbt_int_crosstype_subtypes
+ *
+ * Expose the contents of gbt_int_crosstype_table to SQL as a set of type OIDs.
+ * The btree_gist regression tests use this to check that the cross-type pg_amop
+ * entries in gist_int{2,4,8}_ops stay in agreement with the C dispatch.
+ */
+PG_FUNCTION_INFO_V1(gbt_int_crosstype_subtypes);
+Datum
+gbt_int_crosstype_subtypes(PG_FUNCTION_ARGS)
+{
+	FuncCallContext *funcctx;
+
+	if (SRF_IS_FIRSTCALL())
+		funcctx = SRF_FIRSTCALL_INIT();
+
+	funcctx = SRF_PERCALL_SETUP();
+
+	if (funcctx->call_cntr < lengthof(gbt_int_crosstype_table))
+	{
+		Oid			subtype = gbt_int_crosstype_table[funcctx->call_cntr].subtype;
+
+		SRF_RETURN_NEXT(funcctx, ObjectIdGetDatum(subtype));
+	}
+
+	SRF_RETURN_DONE(funcctx);
+}
+
+/*
+ * Cross-type consistent method for the integer opclasses.
+ *
+ * The key range [lower, upper] and the query value are all compared as int64.
+ * The strategy logic mirrors the same-type path in gbt_num_consistent(): the
+ * query value keeps its own width (no narrowing to the indexed column type),
+ * so out-of-range constants behave according to normal integer comparison.
+ */
+bool
+gbt_int_consistent_x(int64 lower, int64 upper, Datum query, Oid subtype,
+					 const StrategyNumber *strategy, bool is_leaf)
+{
+	int64		q = gbt_int_query_to_int64(query, subtype);
+
+	switch (*strategy)
+	{
+		case BTLessEqualStrategyNumber:
+			/* some k in [lower,upper] has k <= q iff lower <= q */
+			return lower <= q;
+		case BTLessStrategyNumber:
+			/* leaf: key < q. internal: lower <= q (loose) */
+			return is_leaf ? (lower < q) : (lower <= q);
+		case BTEqualStrategyNumber:
+			if (is_leaf)
+				return lower == q;
+			/* internal: lower <= q <= upper */
+			return (lower <= q && q <= upper);
+		case BTGreaterStrategyNumber:
+			/* leaf: key > q. internal: upper >= q (loose) */
+			return is_leaf ? (upper > q) : (upper >= q);
+		case BTGreaterEqualStrategyNumber:
+			return upper >= q;
+		case BtreeGistNotEqualStrategyNumber:
+			return !(lower == q && upper == q);
+		default:
+			return false;
+	}
+}
+
 
 /*
  * The GiST distance method (for KNN-Gist)
  */
 
 float8
@@ -331,12 +465,38 @@ gbt_num_distance(const GBT_NUMKEY_R *key,
 	else
 		retval = 0.0;
 
 	return retval;
 }
 
+/*
+ * Cross-type distance method for the integer opclasses.
+ *
+ * The distance from the query value to the key range is computed in int64 and
+ * returned as a float8, matching the same-type integer distance behaviour.
+ *
+ * Note that the subtraction is performed in float8, so when the key bound and
+ * the query value differ by more than 2^53 the returned distance loses
+ * precision. This only affects the ordering of KNN results that are nearly
+ * equidistant at that scale; it never changes which rows are returned.
+ */
+float8
+gbt_int_distance_x(int64 lower, int64 upper, Datum query, Oid subtype)
+{
+	int64		q = gbt_int_query_to_int64(query, subtype);
+
+	/* query below the range: distance to lower bound */
+	if (lower >= q)
+		return fabs((float8) lower - (float8) q);
+	/* query above the range: distance to upper bound */
+	if (upper <= q)
+		return fabs((float8) upper - (float8) q);
+	/* query inside the range */
+	return 0.0;
+}
+
 
 GIST_SPLITVEC *
 gbt_num_picksplit(const GistEntryVector *entryvec, GIST_SPLITVEC *v,
 				  const gbtree_ninfo *tinfo, FmgrInfo *flinfo)
 {
 	OffsetNumber i,
diff --git a/contrib/btree_gist/btree_utils_num.h b/contrib/btree_gist/btree_utils_num.h
index 53e477d8b1e..d378b0df37e 100644
--- a/contrib/btree_gist/btree_utils_num.h
+++ b/contrib/btree_gist/btree_utils_num.h
@@ -95,12 +95,28 @@ extern bool gbt_num_consistent(const GBT_NUMKEY_R *key, const void *query,
 							   const StrategyNumber *strategy, bool is_leaf,
 							   const gbtree_ninfo *tinfo, FmgrInfo *flinfo);
 
 extern float8 gbt_num_distance(const GBT_NUMKEY_R *key, const void *query,
 							   bool is_leaf, const gbtree_ninfo *tinfo, FmgrInfo *flinfo);
 
+/*
+ * Cross-type consistent/distance helpers for the integer opclasses
+ * (int2/int4/int8).  All three integer widths promote losslessly to int64, so
+ * a cross-type query reduces to a plain int64 comparison and there is no need
+ * for the per-type callback machinery the same-type path uses.  These helpers
+ * are deliberately integer-specific rather than a general cross-type framework;
+ * other type families should grow their own helpers shaped to their needs as
+ * the need arises.
+ */
+extern bool gbt_int_consistent_x(int64 lower, int64 upper,
+								 Datum query, Oid subtype,
+								 const StrategyNumber *strategy, bool is_leaf);
+
+extern float8 gbt_int_distance_x(int64 lower, int64 upper,
+								 Datum query, Oid subtype);
+
 extern GIST_SPLITVEC *gbt_num_picksplit(const GistEntryVector *entryvec, GIST_SPLITVEC *v,
 										const gbtree_ninfo *tinfo, FmgrInfo *flinfo);
 
 extern GISTENTRY *gbt_num_compress(GISTENTRY *entry, const gbtree_ninfo *tinfo);
 
 extern GISTENTRY *gbt_num_fetch(GISTENTRY *entry, const gbtree_ninfo *tinfo);
diff --git a/contrib/btree_gist/meson.build b/contrib/btree_gist/meson.build
index 2b1a5463289..c640cb7d8d2 100644
--- a/contrib/btree_gist/meson.build
+++ b/contrib/btree_gist/meson.build
@@ -49,12 +49,13 @@ install_data(
   'btree_gist--1.4--1.5.sql',
   'btree_gist--1.5--1.6.sql',
   'btree_gist--1.6--1.7.sql',
   'btree_gist--1.7--1.8.sql',
   'btree_gist--1.8--1.9.sql',
   'btree_gist--1.9.sql',
+  'btree_gist--1.9--1.10.sql',
   kwargs: contrib_data_args,
 )
 
 tests += {
   'name': 'btree_gist',
   'sd': meson.current_source_dir(),
@@ -89,10 +90,11 @@ tests += {
       'uuid',
       'not_equal',
       'enum',
       'bool',
       'partitions',
       'stratnum',
+      'int_crosstype',
       'without_overlaps',
     ],
   },
 }
-- 
2.51.0



  [application/octet-stream] 0002-Add-tests-for-cross-type-operators-for-GiST-indexes.patch (23.9K, 4-0002-Add-tests-for-cross-type-operators-for-GiST-indexes.patch)
  download | inline diff:
From e6204108239a2458be11aa2286edd7bd029a3681 Mon Sep 17 00:00:00 2001
From: Alexander Nestorov <[email protected]>
Date: Thu, 4 Jun 2026 00:08:27 +0200
Subject: [PATCH] Add tests for cross-type operators for GiST indexes

---
 contrib/btree_gist/expected/int_crosstype.out | 568 ++++++++++++++++++
 contrib/btree_gist/sql/int_crosstype.sql      | 236 ++++++++
 2 files changed, 804 insertions(+)
 create mode 100644 contrib/btree_gist/expected/int_crosstype.out
 create mode 100644 contrib/btree_gist/sql/int_crosstype.sql

diff --git a/contrib/btree_gist/expected/int_crosstype.out b/contrib/btree_gist/expected/int_crosstype.out
new file mode 100644
index 00000000000..bd2b90bd3f6
--- /dev/null
+++ b/contrib/btree_gist/expected/int_crosstype.out
@@ -0,0 +1,568 @@
+-- Cross-type operator support for int2/int4/int8 in GiST.
+--
+-- Verifies that (a) the cross-type B-tree-style operators registered in
+-- gist_int{2,4,8}_ops match the results of seqscans using the same operator
+-- expressions, (b) the KNN <-> operator works across types and uses the
+-- index, and (c) values outside the smaller subtype's range are handled
+-- according to normal comparison semantics, without narrowing or erroring.
+--
+-- Catalog invariant: the cross-type pg_amop entries in gist_int{2,4,8}_ops must
+-- agree, in both directions, with what the C cross-type dispatch can handle.
+-- The set of supported query subtypes is read from the C side via
+-- gbt_int_crosstype_subtypes(), so there is no hand-maintained second copy of
+-- the list here: registering a pg_amop row whose subtype the dispatch does not
+-- handle (or dropping a dispatch entry while its pg_amop rows remain, or vice
+-- versa) shows up as a diff below. Cross-type pg_amproc rows must also stay
+-- absent, since the dispatch reuses the same-type support functions.
+--
+WITH dispatch_subtypes(typ) AS (
+    SELECT s.subtype::regtype
+    FROM gbt_int_crosstype_subtypes() AS s(subtype)
+),
+gist_int_opclasses(opfamily, indextype) AS (
+    SELECT opc.opcname::text, opc.opcintype::regtype
+    FROM pg_opclass opc
+         JOIN pg_am am ON am.oid = opc.opcmethod
+    WHERE am.amname = 'gist'
+      AND opc.opcname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+),
+expected_pairs(opfamily, lefttype, righttype) AS (
+    SELECT oc.opfamily, oc.indextype, ds.typ
+    FROM gist_int_opclasses oc
+         CROSS JOIN dispatch_subtypes ds
+    WHERE ds.typ <> oc.indextype
+),
+expected_amop(opfamily, lefttype, righttype, strategy, purpose) AS (
+    SELECT opfamily, lefttype, righttype, strategy, purpose
+    FROM expected_pairs
+         CROSS JOIN (VALUES
+             (1, 's'), (2, 's'), (3, 's'), (4, 's'),
+             (5, 's'), (6, 's'), (15, 'o')
+         ) AS strategy_purposes(strategy, purpose)
+),
+actual_amop AS (
+    SELECT opf.opfname::text AS opfamily,
+           amop.amoplefttype::regtype AS lefttype,
+           amop.amoprighttype::regtype AS righttype,
+           amop.amopstrategy::int AS strategy,
+           amop.amoppurpose::text AS purpose
+    FROM pg_amop amop
+         JOIN pg_opfamily opf ON opf.oid = amop.amopfamily
+         JOIN pg_am am ON am.oid = opf.opfmethod
+    WHERE am.amname = 'gist'
+      AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+      AND amop.amoplefttype <> amop.amoprighttype
+)
+SELECT *
+FROM (
+    SELECT 'missing from pg_amop' AS status, *
+    FROM (SELECT * FROM expected_amop EXCEPT SELECT * FROM actual_amop) missing
+    UNION ALL
+    SELECT 'unexpected in pg_amop' AS status, *
+    FROM (SELECT * FROM actual_amop EXCEPT SELECT * FROM expected_amop) unexpected
+) diff
+ORDER BY status, opfamily, lefttype::text, righttype::text, strategy, purpose;
+ status | opfamily | lefttype | righttype | strategy | purpose 
+--------+----------+----------+-----------+----------+---------
+(0 rows)
+
+SELECT opf.opfname AS opfamily,
+       amproc.amproclefttype::regtype AS lefttype,
+       amproc.amprocrighttype::regtype AS righttype,
+       amproc.amprocnum AS procnum,
+       amproc.amproc::regproc AS proc
+FROM pg_amproc amproc
+     JOIN pg_opfamily opf ON opf.oid = amproc.amprocfamily
+     JOIN pg_am am ON am.oid = opf.opfmethod
+WHERE am.amname = 'gist'
+  AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+  AND amproc.amproclefttype <> amproc.amprocrighttype
+ORDER BY opf.opfname,
+         amproc.amproclefttype::regtype::text,
+         amproc.amprocrighttype::regtype::text,
+         amproc.amprocnum,
+         amproc.amproc::regproc::text;
+ opfamily | lefttype | righttype | procnum | proc 
+----------+----------+-----------+---------+------
+(0 rows)
+
+CREATE TABLE ct_i2 (a int2);
+CREATE TABLE ct_i4 (a int4);
+CREATE TABLE ct_i8 (a int8);
+INSERT INTO ct_i2 SELECT g::int2 FROM generate_series(-100, 100) g;
+INSERT INTO ct_i4 SELECT g FROM generate_series(-100, 100) g;
+INSERT INTO ct_i8 SELECT g::int8 FROM generate_series(-100, 100) g;
+-- Add some values that are representable only in wider types, to exercise
+-- the path where the cross-type query constant is out of range of the key
+-- type.
+INSERT INTO ct_i4 VALUES (100000), (-100000);
+INSERT INTO ct_i8 VALUES (5000000000), (-5000000000);
+CREATE INDEX ct_i2_idx ON ct_i2 USING gist (a);
+CREATE INDEX ct_i4_idx ON ct_i4 USING gist (a);
+CREATE INDEX ct_i8_idx ON ct_i8 USING gist (a);
+ANALYZE ct_i2;
+ANALYZE ct_i4;
+ANALYZE ct_i8;
+SET enable_seqscan = off;
+SET enable_bitmapscan = off;
+-- int2 key x int4 query
+SELECT count(*) FROM ct_i2 WHERE a <  50::int4;
+ count 
+-------
+   150
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int4;
+ count 
+-------
+   151
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a =  50::int4;
+ count 
+-------
+     1
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int4;
+ count 
+-------
+    51
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a >  50::int4;
+ count 
+-------
+    50
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int4;
+ count 
+-------
+   200
+(1 row)
+
+-- query out of int2 range: matches nothing for =, everything for <>
+SELECT count(*) FROM ct_i2 WHERE a =  100000::int4;
+ count 
+-------
+     0
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <> 100000::int4;
+ count 
+-------
+   201
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <  100000::int4;
+ count 
+-------
+   201
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a >  100000::int4;
+ count 
+-------
+     0
+(1 row)
+
+-- int2 key x int8 query
+SELECT count(*) FROM ct_i2 WHERE a <  50::int8;
+ count 
+-------
+   150
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int8;
+ count 
+-------
+   151
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a =  50::int8;
+ count 
+-------
+     1
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int8;
+ count 
+-------
+    51
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a >  50::int8;
+ count 
+-------
+    50
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int8;
+ count 
+-------
+   200
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a =  5000000000::int8;
+ count 
+-------
+     0
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <  5000000000::int8;
+ count 
+-------
+   201
+(1 row)
+
+-- int4 key x int2 query
+SELECT count(*) FROM ct_i4 WHERE a <  50::int2;
+ count 
+-------
+   151
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int2;
+ count 
+-------
+   152
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a =  50::int2;
+ count 
+-------
+     1
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int2;
+ count 
+-------
+    52
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a >  50::int2;
+ count 
+-------
+    51
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int2;
+ count 
+-------
+   202
+(1 row)
+
+-- int4 key x int8 query
+SELECT count(*) FROM ct_i4 WHERE a <  50::int8;
+ count 
+-------
+   151
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int8;
+ count 
+-------
+   152
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a =  50::int8;
+ count 
+-------
+     1
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int8;
+ count 
+-------
+    52
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a >  50::int8;
+ count 
+-------
+    51
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int8;
+ count 
+-------
+   202
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a =  5000000000::int8;
+ count 
+-------
+     0
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <  5000000000::int8;
+ count 
+-------
+   203
+(1 row)
+
+-- int8 key x int2 query
+SELECT count(*) FROM ct_i8 WHERE a <  50::int2;
+ count 
+-------
+   151
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int2;
+ count 
+-------
+   152
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a =  50::int2;
+ count 
+-------
+     1
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int2;
+ count 
+-------
+    52
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a >  50::int2;
+ count 
+-------
+    51
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int2;
+ count 
+-------
+   202
+(1 row)
+
+-- int8 key x int4 query
+SELECT count(*) FROM ct_i8 WHERE a <  50::int4;
+ count 
+-------
+   151
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int4;
+ count 
+-------
+   152
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a =  50::int4;
+ count 
+-------
+     1
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int4;
+ count 
+-------
+    52
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a >  50::int4;
+ count 
+-------
+    51
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int4;
+ count 
+-------
+   202
+(1 row)
+
+-- Confirm the index is actually used for a cross-type predicate.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_i4 WHERE a = 50::int8;
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Index Only Scan using ct_i4_idx on ct_i4
+         Index Cond: (a = '50'::bigint)
+(3 rows)
+
+-- Cross-type KNN: int4 key ordered by int2 / int8 queries.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+                   QUERY PLAN                   
+------------------------------------------------
+ Limit
+   ->  Index Only Scan using ct_i4_idx on ct_i4
+         Order By: (a <-> '-100'::smallint)
+(3 rows)
+
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+  a   
+------
+ -100
+  -99
+  -98
+(3 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+                   QUERY PLAN                   
+------------------------------------------------
+ Limit
+   ->  Index Only Scan using ct_i4_idx on ct_i4
+         Order By: (a <-> '-100'::bigint)
+(3 rows)
+
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+  a   
+------
+ -100
+  -99
+  -98
+(3 rows)
+
+-- Cross-type KNN: int2 key ordered by int4 / int8 queries.
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int4 LIMIT 3;
+  a   
+------
+ -100
+  -99
+  -98
+(3 rows)
+
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int8 LIMIT 3;
+  a   
+------
+ -100
+  -99
+  -98
+(3 rows)
+
+-- Cross-type KNN: int8 key ordered by int2 / int4 queries.
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int2 LIMIT 3;
+  a   
+------
+ -100
+  -99
+  -98
+(3 rows)
+
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int4 LIMIT 3;
+  a   
+------
+ -100
+  -99
+  -98
+(3 rows)
+
+-- Combined: cross-type WHERE + cross-type ORDER BY on the same index.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+                   QUERY PLAN                   
+------------------------------------------------
+ Limit
+   ->  Index Only Scan using ct_i4_idx on ct_i4
+         Index Cond: (a < '80'::bigint)
+         Order By: (a <-> '-100'::bigint)
+(4 rows)
+
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+  a   
+------
+ -100
+  -99
+  -98
+(3 rows)
+
+-- Standalone distance-function smoke tests (not going through the index),
+-- including the overflow-detection paths.
+SELECT int2_int4_dist(3::int2, 10::int4);
+ int2_int4_dist 
+----------------
+              7
+(1 row)
+
+SELECT int4_int2_dist(-5::int4, 5::int2);
+ int4_int2_dist 
+----------------
+             10
+(1 row)
+
+SELECT int2_int8_dist(3::int2, 10::int8);
+ int2_int8_dist 
+----------------
+              7
+(1 row)
+
+SELECT int8_int2_dist(100::int8, -5::int2);
+ int8_int2_dist 
+----------------
+            105
+(1 row)
+
+SELECT int4_int8_dist(100::int4, 5000000000::int8);
+ int4_int8_dist 
+----------------
+     4999999900
+(1 row)
+
+SELECT int8_int4_dist(5000000000::int8, 100::int4);
+ int8_int4_dist 
+----------------
+     4999999900
+(1 row)
+
+-- Overflow detection: INT32_MIN distance from a positive int2 can't fit
+-- in int32, should error.
+SELECT int2_int4_dist(1::int2, -2147483648::int4);
+ERROR:  integer out of range
+-- Likewise INT64_MIN distance from a positive int4 can't fit in int64.
+SELECT int4_int8_dist(1::int4, -9223372036854775808::int8);
+ERROR:  bigint out of range
+--
+-- Multi-column GiST index with mixed-type predicates. This is the
+-- original motivating case: without cross-type operator support the
+-- planner can only use one column as an Index Cond and applies the
+-- other(s) as a Filter post-scan. Here both columns should appear as
+-- Index Cond.
+--
+CREATE TABLE ct_multi (a int4, b int8);
+INSERT INTO ct_multi
+    SELECT g, (g * 2)::int8 FROM generate_series(-50, 50) g;
+CREATE INDEX ct_multi_idx ON ct_multi USING gist (a, b);
+ANALYZE ct_multi;
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Index Only Scan using ct_multi_idx on ct_multi
+         Index Cond: ((a = '25'::bigint) AND (b = 50))
+(3 rows)
+
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+ count 
+-------
+     1
+(1 row)
+
+-- Mixed cross-type ranges across both columns.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Aggregate
+   ->  Index Only Scan using ct_multi_idx on ct_multi
+         Index Cond: ((a < '10'::bigint) AND (b > '0'::smallint))
+(3 rows)
+
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+ count 
+-------
+     9
+(1 row)
+
+DROP TABLE ct_multi;
+DROP TABLE ct_i2;
+DROP TABLE ct_i4;
+DROP TABLE ct_i8;
diff --git a/contrib/btree_gist/sql/int_crosstype.sql b/contrib/btree_gist/sql/int_crosstype.sql
new file mode 100644
index 00000000000..d38084f1941
--- /dev/null
+++ b/contrib/btree_gist/sql/int_crosstype.sql
@@ -0,0 +1,236 @@
+-- Cross-type operator support for int2/int4/int8 in GiST.
+--
+-- Verifies that (a) the cross-type B-tree-style operators registered in
+-- gist_int{2,4,8}_ops match the results of seqscans using the same operator
+-- expressions, (b) the KNN <-> operator works across types and uses the
+-- index, and (c) values outside the smaller subtype's range are handled
+-- according to normal comparison semantics, without narrowing or erroring.
+
+--
+-- Catalog invariant: the cross-type pg_amop entries in gist_int{2,4,8}_ops must
+-- agree, in both directions, with what the C cross-type dispatch can handle.
+-- The set of supported query subtypes is read from the C side via
+-- gbt_int_crosstype_subtypes(), so there is no hand-maintained second copy of
+-- the list here: registering a pg_amop row whose subtype the dispatch does not
+-- handle (or dropping a dispatch entry while its pg_amop rows remain, or vice
+-- versa) shows up as a diff below. Cross-type pg_amproc rows must also stay
+-- absent, since the dispatch reuses the same-type support functions.
+--
+WITH dispatch_subtypes(typ) AS (
+    SELECT s.subtype::regtype
+    FROM gbt_int_crosstype_subtypes() AS s(subtype)
+),
+gist_int_opclasses(opfamily, indextype) AS (
+    SELECT opc.opcname::text, opc.opcintype::regtype
+    FROM pg_opclass opc
+         JOIN pg_am am ON am.oid = opc.opcmethod
+    WHERE am.amname = 'gist'
+      AND opc.opcname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+),
+expected_pairs(opfamily, lefttype, righttype) AS (
+    SELECT oc.opfamily, oc.indextype, ds.typ
+    FROM gist_int_opclasses oc
+         CROSS JOIN dispatch_subtypes ds
+    WHERE ds.typ <> oc.indextype
+),
+expected_amop(opfamily, lefttype, righttype, strategy, purpose) AS (
+    SELECT opfamily, lefttype, righttype, strategy, purpose
+    FROM expected_pairs
+         CROSS JOIN (VALUES
+             (1, 's'), (2, 's'), (3, 's'), (4, 's'),
+             (5, 's'), (6, 's'), (15, 'o')
+         ) AS strategy_purposes(strategy, purpose)
+),
+actual_amop AS (
+    SELECT opf.opfname::text AS opfamily,
+           amop.amoplefttype::regtype AS lefttype,
+           amop.amoprighttype::regtype AS righttype,
+           amop.amopstrategy::int AS strategy,
+           amop.amoppurpose::text AS purpose
+    FROM pg_amop amop
+         JOIN pg_opfamily opf ON opf.oid = amop.amopfamily
+         JOIN pg_am am ON am.oid = opf.opfmethod
+    WHERE am.amname = 'gist'
+      AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+      AND amop.amoplefttype <> amop.amoprighttype
+)
+SELECT *
+FROM (
+    SELECT 'missing from pg_amop' AS status, *
+    FROM (SELECT * FROM expected_amop EXCEPT SELECT * FROM actual_amop) missing
+    UNION ALL
+    SELECT 'unexpected in pg_amop' AS status, *
+    FROM (SELECT * FROM actual_amop EXCEPT SELECT * FROM expected_amop) unexpected
+) diff
+ORDER BY status, opfamily, lefttype::text, righttype::text, strategy, purpose;
+
+SELECT opf.opfname AS opfamily,
+       amproc.amproclefttype::regtype AS lefttype,
+       amproc.amprocrighttype::regtype AS righttype,
+       amproc.amprocnum AS procnum,
+       amproc.amproc::regproc AS proc
+FROM pg_amproc amproc
+     JOIN pg_opfamily opf ON opf.oid = amproc.amprocfamily
+     JOIN pg_am am ON am.oid = opf.opfmethod
+WHERE am.amname = 'gist'
+  AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+  AND amproc.amproclefttype <> amproc.amprocrighttype
+ORDER BY opf.opfname,
+         amproc.amproclefttype::regtype::text,
+         amproc.amprocrighttype::regtype::text,
+         amproc.amprocnum,
+         amproc.amproc::regproc::text;
+
+CREATE TABLE ct_i2 (a int2);
+CREATE TABLE ct_i4 (a int4);
+CREATE TABLE ct_i8 (a int8);
+
+INSERT INTO ct_i2 SELECT g::int2 FROM generate_series(-100, 100) g;
+INSERT INTO ct_i4 SELECT g FROM generate_series(-100, 100) g;
+INSERT INTO ct_i8 SELECT g::int8 FROM generate_series(-100, 100) g;
+
+-- Add some values that are representable only in wider types, to exercise
+-- the path where the cross-type query constant is out of range of the key
+-- type.
+INSERT INTO ct_i4 VALUES (100000), (-100000);
+INSERT INTO ct_i8 VALUES (5000000000), (-5000000000);
+
+CREATE INDEX ct_i2_idx ON ct_i2 USING gist (a);
+CREATE INDEX ct_i4_idx ON ct_i4 USING gist (a);
+CREATE INDEX ct_i8_idx ON ct_i8 USING gist (a);
+
+ANALYZE ct_i2;
+ANALYZE ct_i4;
+ANALYZE ct_i8;
+
+SET enable_seqscan = off;
+SET enable_bitmapscan = off;
+
+-- int2 key x int4 query
+SELECT count(*) FROM ct_i2 WHERE a <  50::int4;
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int4;
+SELECT count(*) FROM ct_i2 WHERE a =  50::int4;
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int4;
+SELECT count(*) FROM ct_i2 WHERE a >  50::int4;
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int4;
+
+-- query out of int2 range: matches nothing for =, everything for <>
+SELECT count(*) FROM ct_i2 WHERE a =  100000::int4;
+SELECT count(*) FROM ct_i2 WHERE a <> 100000::int4;
+SELECT count(*) FROM ct_i2 WHERE a <  100000::int4;
+SELECT count(*) FROM ct_i2 WHERE a >  100000::int4;
+
+-- int2 key x int8 query
+SELECT count(*) FROM ct_i2 WHERE a <  50::int8;
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a =  50::int8;
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a >  50::int8;
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a =  5000000000::int8;
+SELECT count(*) FROM ct_i2 WHERE a <  5000000000::int8;
+
+-- int4 key x int2 query
+SELECT count(*) FROM ct_i4 WHERE a <  50::int2;
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int2;
+SELECT count(*) FROM ct_i4 WHERE a =  50::int2;
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int2;
+SELECT count(*) FROM ct_i4 WHERE a >  50::int2;
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int2;
+
+-- int4 key x int8 query
+SELECT count(*) FROM ct_i4 WHERE a <  50::int8;
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a =  50::int8;
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a >  50::int8;
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a =  5000000000::int8;
+SELECT count(*) FROM ct_i4 WHERE a <  5000000000::int8;
+
+-- int8 key x int2 query
+SELECT count(*) FROM ct_i8 WHERE a <  50::int2;
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int2;
+SELECT count(*) FROM ct_i8 WHERE a =  50::int2;
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int2;
+SELECT count(*) FROM ct_i8 WHERE a >  50::int2;
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int2;
+
+-- int8 key x int4 query
+SELECT count(*) FROM ct_i8 WHERE a <  50::int4;
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int4;
+SELECT count(*) FROM ct_i8 WHERE a =  50::int4;
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int4;
+SELECT count(*) FROM ct_i8 WHERE a >  50::int4;
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int4;
+
+-- Confirm the index is actually used for a cross-type predicate.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_i4 WHERE a = 50::int8;
+
+-- Cross-type KNN: int4 key ordered by int2 / int8 queries.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+
+-- Cross-type KNN: int2 key ordered by int4 / int8 queries.
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int4 LIMIT 3;
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int8 LIMIT 3;
+
+-- Cross-type KNN: int8 key ordered by int2 / int4 queries.
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int2 LIMIT 3;
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int4 LIMIT 3;
+
+-- Combined: cross-type WHERE + cross-type ORDER BY on the same index.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+
+-- Standalone distance-function smoke tests (not going through the index),
+-- including the overflow-detection paths.
+SELECT int2_int4_dist(3::int2, 10::int4);
+SELECT int4_int2_dist(-5::int4, 5::int2);
+SELECT int2_int8_dist(3::int2, 10::int8);
+SELECT int8_int2_dist(100::int8, -5::int2);
+SELECT int4_int8_dist(100::int4, 5000000000::int8);
+SELECT int8_int4_dist(5000000000::int8, 100::int4);
+
+-- Overflow detection: INT32_MIN distance from a positive int2 can't fit
+-- in int32, should error.
+SELECT int2_int4_dist(1::int2, -2147483648::int4);
+-- Likewise INT64_MIN distance from a positive int4 can't fit in int64.
+SELECT int4_int8_dist(1::int4, -9223372036854775808::int8);
+
+--
+-- Multi-column GiST index with mixed-type predicates. This is the
+-- original motivating case: without cross-type operator support the
+-- planner can only use one column as an Index Cond and applies the
+-- other(s) as a Filter post-scan. Here both columns should appear as
+-- Index Cond.
+--
+CREATE TABLE ct_multi (a int4, b int8);
+INSERT INTO ct_multi
+    SELECT g, (g * 2)::int8 FROM generate_series(-50, 50) g;
+CREATE INDEX ct_multi_idx ON ct_multi USING gist (a, b);
+ANALYZE ct_multi;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+
+-- Mixed cross-type ranges across both columns.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+
+DROP TABLE ct_multi;
+
+DROP TABLE ct_i2;
+DROP TABLE ct_i4;
+DROP TABLE ct_i8;
-- 
2.51.0



view thread (9+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: [PATCH] btree_gist: add cross-type integer operator support for GiST
  In-Reply-To: <80ef3b41-1a71-47a4-a320-29e118d7092c@Spark>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox