public inbox for [email protected]
help / color / mirror / Atom feedFrom: Alexander Nestorov <[email protected]>
To: Andrey Borodin <[email protected]>
Cc: pgsql-hackers mailing list <[email protected]>
Subject: Re: [PATCH] btree_gist: add cross-type integer operator support for GiST
Date: Sun, 14 Jun 2026 01:35:14 +0200
Message-ID: <80ef3b41-1a71-47a4-a320-29e118d7092c@Spark> (raw)
In-Reply-To: <18e88767-6c31-402a-887e-37c38b366a6a@Spark>
References: <aac10ffa-a0ca-4c49-846b-3655cbc6b37e@Spark>
<36b4f67d-5975-452c-a6b8-b6407f0924ee@Spark>
<[email protected]>
<18e88767-6c31-402a-887e-37c38b366a6a@Spark>
Hello Andrey!
Following up with the things I owed you: the benchmarks, the consistency check
and adding a note for the 2^53 case.
I added a fast path. Each integer opclass's consistent() / distance() now
detects the "same type" case and calls the original gbt_num_consistent() /
gbt_num_distance() directly.
To confirm there's no regression I ran a microbenchmark on an -O2 build, no
asserts, single client, over a 500k row int4 GiST index, with the following
options:
-c enable_seqscan=off \
-c enable_bitmapscan=off \
-c enable_sort=off \
-c max_parallel_workers_per_gather=0
This is the base for the bench:
CREATE EXTENSION IF NOT EXISTS btree_gist;
DROP TABLE IF EXISTS benchg;
CREATE TABLE benchg (a int4);
INSERT INTO benchg SELECT g FROM generate_series(0, 499999) g;
CREATE INDEX benchg_idx ON benchg USING gist (a);
VACUUM (ANALYZE, FREEZE) benchg;
And the two workloads:
consistent(), full-range index-only count(*):
SELECT count(*) FROM benchg WHERE a >= 0 AND a <= 499999;
distance(), full KNN ordering (ORDER BY a<->k over all rows):
SELECT count(*) FROM (SELECT a FROM benchg ORDER BY a <-> 250000 LIMIT 1000000) q;
The numbers in ms (12 repetitions, 15s each) before
(3e3d7875e95621b02311ea3443e5139e3bce944a) and after my patch:
before consistent min/med/mean = 51.754 52.718 54.137 ms
after consistent min/med/mean = 52.042 52.480 52.572 ms
------------------------------------------------------------------------
before distance min/med/mean = 76.863 77.177 77.395 ms
after distance min/med/mean = 77.357 77.803 77.980 ms
All numbers seem to be within measurement noise, except the consistent-before,
which is probably inflated by one slow rep.
Regarding the other point, I explored the regression suite path I mentioned.
The consistent() / distance() functions dispatch cross-type queries through a
single static table of supported subtype OIDs (gbt_int_crosstype_table in
btree_utils_num.c). I expose that exact table to SQL, in gbt_int_crosstype_subtypes(),
so there is no hand-maintained second copy of the list.
The int_crosstype.sql regression test then builds the set of cross-type
(lefttype, righttype, strategy) entries that should exist in pg_amop from that
function, and EXCEPTs it against the cross-type rows actually present in
gist_int{2,4,8}_ops:
- a pg_amop row whose subtype the C dispatch does not handle shows up as
"unexpected in pg_amop", and
- a dispatch entry without the matching pg_amop rows shows up as
"missing from pg_amop".
Either kind of drift produces a diff under `make check`. So adding an ALTER
OPERATOR FAMILY entry without a matching dispatch entry (or vice versa) fails
the suite (as I mentioned in my previous email, I'm not aware of a way to do
this with amvalidate() without patching core).
I'm attaching the new set of patches (this time I include the tests).
Best regards!
Attachments:
[application/octet-stream] 0001-Implement-cross-type-operators-for-GiST-indexes.patch (26.4K, 3-0001-Implement-cross-type-operators-for-GiST-indexes.patch)
download | inline diff:
From 363b9c22874cdbb6cf780899c784c813bb10f94e Mon Sep 17 00:00:00 2001
From: Alexander Nestorov <[email protected]>
Date: Thu, 4 Jun 2026 00:06:54 +0200
Subject: [PATCH] Implement cross-type operators for GiST indexes
---
contrib/btree_gist/Makefile | 4 +-
contrib/btree_gist/btree_gist--1.9--1.10.sql | 143 +++++++++++++++++
contrib/btree_gist/btree_gist.control | 2 +-
contrib/btree_gist/btree_int2.c | 70 ++++++--
contrib/btree_gist/btree_int4.c | 70 ++++++--
contrib/btree_gist/btree_int8.c | 70 ++++++--
contrib/btree_gist/btree_utils_num.c | 160 +++++++++++++++++++
contrib/btree_gist/btree_utils_num.h | 16 ++
contrib/btree_gist/meson.build | 2 +
9 files changed, 501 insertions(+), 36 deletions(-)
create mode 100644 contrib/btree_gist/btree_gist--1.9--1.10.sql
diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
index fbbbca95598..1d0668d97ab 100644
--- a/contrib/btree_gist/Makefile
+++ b/contrib/btree_gist/Makefile
@@ -32,19 +32,19 @@ OBJS = \
EXTENSION = btree_gist
DATA = btree_gist--1.0--1.1.sql \
btree_gist--1.1--1.2.sql btree_gist--1.2--1.3.sql \
btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \
btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \
btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql \
- btree_gist--1.9.sql
+ btree_gist--1.9.sql btree_gist--1.9--1.10.sql
PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \
time timetz date interval macaddr macaddr8 inet cidr text varchar char \
bytea bit varbit numeric uuid not_equal enum bool partitions \
- stratnum without_overlaps
+ stratnum int_crosstype without_overlaps
SHLIB_LINK += $(filter -lm, $(LIBS))
ifdef USE_PGXS
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
diff --git a/contrib/btree_gist/btree_gist--1.9--1.10.sql b/contrib/btree_gist/btree_gist--1.9--1.10.sql
new file mode 100644
index 00000000000..c9ff1955204
--- /dev/null
+++ b/contrib/btree_gist/btree_gist--1.9--1.10.sql
@@ -0,0 +1,143 @@
+/* contrib/btree_gist/btree_gist--1.9--1.10.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.10'" to load this file. \quit
+
+-- Add cross-type operator support for the integer trio (int2, int4, int8)
+-- to the existing GiST operator families.
+--
+-- GiST's amvalidate requires support functions in a family to have matching
+-- left/right input types, so the catalog additions below are deliberately
+-- pg_amop-only. The existing consistent/distance support functions dispatch
+-- on the subtype OID: same-type queries take the normal path, while mixed-width
+-- integer queries are promoted to int64 and compared by the integer cross-type
+-- helpers in btree_utils_num.c.
+
+CREATE FUNCTION int2_int4_dist(int2, int4)
+RETURNS int4
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int4_int2_dist(int4, int2)
+RETURNS int4
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int2_int8_dist(int2, int8)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int8_int2_dist(int8, int2)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int4_int8_dist(int4, int8)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION int8_int4_dist(int8, int4)
+RETURNS int8
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+-- Introspection helper exposing the integer query subtypes understood by the C
+-- cross-type dispatch (gbt_int_crosstype_table in btree_utils_num.c). The
+-- regression tests use it to assert that the cross-type pg_amop entries below
+-- never drift from what the dispatch can handle.
+CREATE FUNCTION gbt_int_crosstype_subtypes()
+RETURNS SETOF oid
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE PARALLEL SAFE;
+
+CREATE OPERATOR <-> (
+ LEFTARG = int2,
+ RIGHTARG = int4,
+ PROCEDURE = int2_int4_dist,
+ COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+ LEFTARG = int4,
+ RIGHTARG = int2,
+ PROCEDURE = int4_int2_dist,
+ COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+ LEFTARG = int2,
+ RIGHTARG = int8,
+ PROCEDURE = int2_int8_dist,
+ COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+ LEFTARG = int8,
+ RIGHTARG = int2,
+ PROCEDURE = int8_int2_dist,
+ COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+ LEFTARG = int4,
+ RIGHTARG = int8,
+ PROCEDURE = int4_int8_dist,
+ COMMUTATOR = '<->'
+);
+
+CREATE OPERATOR <-> (
+ LEFTARG = int8,
+ RIGHTARG = int4,
+ PROCEDURE = int8_int4_dist,
+ COMMUTATOR = '<->'
+);
+
+ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
+ OPERATOR 1 < (int2, int4),
+ OPERATOR 2 <= (int2, int4),
+ OPERATOR 3 = (int2, int4),
+ OPERATOR 4 >= (int2, int4),
+ OPERATOR 5 > (int2, int4),
+ OPERATOR 6 <> (int2, int4),
+ OPERATOR 15 <-> (int2, int4) FOR ORDER BY pg_catalog.integer_ops,
+ OPERATOR 1 < (int2, int8),
+ OPERATOR 2 <= (int2, int8),
+ OPERATOR 3 = (int2, int8),
+ OPERATOR 4 >= (int2, int8),
+ OPERATOR 5 > (int2, int8),
+ OPERATOR 6 <> (int2, int8),
+ OPERATOR 15 <-> (int2, int8) FOR ORDER BY pg_catalog.integer_ops;
+
+ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
+ OPERATOR 1 < (int4, int2),
+ OPERATOR 2 <= (int4, int2),
+ OPERATOR 3 = (int4, int2),
+ OPERATOR 4 >= (int4, int2),
+ OPERATOR 5 > (int4, int2),
+ OPERATOR 6 <> (int4, int2),
+ OPERATOR 15 <-> (int4, int2) FOR ORDER BY pg_catalog.integer_ops,
+ OPERATOR 1 < (int4, int8),
+ OPERATOR 2 <= (int4, int8),
+ OPERATOR 3 = (int4, int8),
+ OPERATOR 4 >= (int4, int8),
+ OPERATOR 5 > (int4, int8),
+ OPERATOR 6 <> (int4, int8),
+ OPERATOR 15 <-> (int4, int8) FOR ORDER BY pg_catalog.integer_ops;
+
+ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
+ OPERATOR 1 < (int8, int2),
+ OPERATOR 2 <= (int8, int2),
+ OPERATOR 3 = (int8, int2),
+ OPERATOR 4 >= (int8, int2),
+ OPERATOR 5 > (int8, int2),
+ OPERATOR 6 <> (int8, int2),
+ OPERATOR 15 <-> (int8, int2) FOR ORDER BY pg_catalog.integer_ops,
+ OPERATOR 1 < (int8, int4),
+ OPERATOR 2 <= (int8, int4),
+ OPERATOR 3 = (int8, int4),
+ OPERATOR 4 >= (int8, int4),
+ OPERATOR 5 > (int8, int4),
+ OPERATOR 6 <> (int8, int4),
+ OPERATOR 15 <-> (int8, int4) FOR ORDER BY pg_catalog.integer_ops;
diff --git a/contrib/btree_gist/btree_gist.control b/contrib/btree_gist/btree_gist.control
index 69d9341a0ad..e606fa6551d 100644
--- a/contrib/btree_gist/btree_gist.control
+++ b/contrib/btree_gist/btree_gist.control
@@ -1,6 +1,6 @@
# btree_gist extension
comment = 'support for indexing common datatypes in GiST'
-default_version = '1.9'
+default_version = '1.10'
module_pathname = '$libdir/btree_gist'
relocatable = true
trusted = true
diff --git a/contrib/btree_gist/btree_int2.c b/contrib/btree_gist/btree_int2.c
index cc4b33177e3..d00b24d9a57 100644
--- a/contrib/btree_gist/btree_int2.c
+++ b/contrib/btree_gist/btree_int2.c
@@ -2,12 +2,13 @@
* contrib/btree_gist/btree_int2.c
*/
#include "postgres.h"
#include "btree_gist.h"
#include "btree_utils_num.h"
+#include "catalog/pg_type.h"
#include "common/int.h"
#include "utils/rel.h"
#include "utils/sortsupport.h"
typedef struct int16key
{
@@ -73,13 +74,12 @@ gbt_int2key_cmp(const void *a, const void *b, FmgrInfo *flinfo)
static float8
gbt_int2_dist(const void *a, const void *b, FmgrInfo *flinfo)
{
return GET_FLOAT_DISTANCE(int16, a, b);
}
-
static const gbtree_ninfo tinfo =
{
gbt_t_int2,
sizeof(int16),
4, /* sizeof(gbtreekey4) */
gbt_int2gt,
@@ -109,12 +109,46 @@ int2_dist(PG_FUNCTION_ARGS)
ra = abs(r);
PG_RETURN_INT16(ra);
}
+PG_FUNCTION_INFO_V1(int2_int4_dist);
+Datum
+int2_int4_dist(PG_FUNCTION_ARGS)
+{
+ int32 a = (int32) PG_GETARG_INT16(0);
+ int32 b = PG_GETARG_INT32(1);
+ int32 r;
+
+ if (pg_sub_s32_overflow(a, b, &r) ||
+ r == PG_INT32_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("integer out of range")));
+
+ PG_RETURN_INT32(abs(r));
+}
+
+PG_FUNCTION_INFO_V1(int2_int8_dist);
+Datum
+int2_int8_dist(PG_FUNCTION_ARGS)
+{
+ int64 a = (int64) PG_GETARG_INT16(0);
+ int64 b = PG_GETARG_INT64(1);
+ int64 r;
+
+ if (pg_sub_s64_overflow(a, b, &r) ||
+ r == PG_INT64_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ PG_RETURN_INT64(i64abs(r));
+}
+
/**************************************************
* GiST support functions
**************************************************/
Datum
@@ -134,47 +168,61 @@ gbt_int2_fetch(PG_FUNCTION_ARGS)
}
Datum
gbt_int2_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- int16 query = PG_GETARG_INT16(1);
+ Datum queryDatum = PG_GETARG_DATUM(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
-#ifdef NOT_USED
Oid subtype = PG_GETARG_OID(3);
-#endif
bool *recheck = (bool *) PG_GETARG_POINTER(4);
int16KEY *kkk = (int16KEY *) DatumGetPointer(entry->key);
+ int16 query;
GBT_NUMKEY_R key;
/* All cases served by this function are exact */
*recheck = false;
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
- PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
- GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
+ if (likely(subtype == InvalidOid || subtype == INT2OID))
+ {
+ query = DatumGetInt16(queryDatum);
+ PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
+ GIST_LEAF(entry), &tinfo,
+ fcinfo->flinfo));
+ }
+
+ PG_RETURN_BOOL(gbt_int_consistent_x((int64) kkk->lower, (int64) kkk->upper,
+ queryDatum, subtype, &strategy,
+ GIST_LEAF(entry)));
}
Datum
gbt_int2_distance(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- int16 query = PG_GETARG_INT16(1);
-#ifdef NOT_USED
+ Datum queryDatum = PG_GETARG_DATUM(1);
Oid subtype = PG_GETARG_OID(3);
-#endif
int16KEY *kkk = (int16KEY *) DatumGetPointer(entry->key);
+ int16 query;
GBT_NUMKEY_R key;
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
- PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
- &tinfo, fcinfo->flinfo));
+ if (likely(subtype == InvalidOid || subtype == INT2OID))
+ {
+ query = DatumGetInt16(queryDatum);
+ PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
+ &tinfo, fcinfo->flinfo));
+ }
+
+ PG_RETURN_FLOAT8(gbt_int_distance_x((int64) kkk->lower, (int64) kkk->upper,
+ queryDatum, subtype));
}
Datum
gbt_int2_union(PG_FUNCTION_ARGS)
{
GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
diff --git a/contrib/btree_gist/btree_int4.c b/contrib/btree_gist/btree_int4.c
index 47790578e6b..4df91f2057c 100644
--- a/contrib/btree_gist/btree_int4.c
+++ b/contrib/btree_gist/btree_int4.c
@@ -1,12 +1,13 @@
/*
* contrib/btree_gist/btree_int4.c
*/
#include "postgres.h"
#include "btree_gist.h"
#include "btree_utils_num.h"
+#include "catalog/pg_type.h"
#include "common/int.h"
#include "utils/rel.h"
#include "utils/sortsupport.h"
typedef struct int32key
{
@@ -71,13 +72,12 @@ gbt_int4key_cmp(const void *a, const void *b, FmgrInfo *flinfo)
static float8
gbt_int4_dist(const void *a, const void *b, FmgrInfo *flinfo)
{
return GET_FLOAT_DISTANCE(int32, a, b);
}
-
static const gbtree_ninfo tinfo =
{
gbt_t_int4,
sizeof(int32),
8, /* sizeof(gbtreekey8) */
gbt_int4gt,
@@ -107,12 +107,46 @@ int4_dist(PG_FUNCTION_ARGS)
ra = abs(r);
PG_RETURN_INT32(ra);
}
+PG_FUNCTION_INFO_V1(int4_int2_dist);
+Datum
+int4_int2_dist(PG_FUNCTION_ARGS)
+{
+ int32 a = PG_GETARG_INT32(0);
+ int32 b = (int32) PG_GETARG_INT16(1);
+ int32 r;
+
+ if (pg_sub_s32_overflow(a, b, &r) ||
+ r == PG_INT32_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("integer out of range")));
+
+ PG_RETURN_INT32(abs(r));
+}
+
+PG_FUNCTION_INFO_V1(int4_int8_dist);
+Datum
+int4_int8_dist(PG_FUNCTION_ARGS)
+{
+ int64 a = (int64) PG_GETARG_INT32(0);
+ int64 b = PG_GETARG_INT64(1);
+ int64 r;
+
+ if (pg_sub_s64_overflow(a, b, &r) ||
+ r == PG_INT64_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ PG_RETURN_INT64(i64abs(r));
+}
+
/**************************************************
* GiST support functions
**************************************************/
Datum
@@ -132,47 +166,61 @@ gbt_int4_fetch(PG_FUNCTION_ARGS)
}
Datum
gbt_int4_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- int32 query = PG_GETARG_INT32(1);
+ Datum queryDatum = PG_GETARG_DATUM(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
-#ifdef NOT_USED
Oid subtype = PG_GETARG_OID(3);
-#endif
bool *recheck = (bool *) PG_GETARG_POINTER(4);
int32KEY *kkk = (int32KEY *) DatumGetPointer(entry->key);
+ int32 query;
GBT_NUMKEY_R key;
/* All cases served by this function are exact */
*recheck = false;
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
- PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
- GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
+ if (likely(subtype == InvalidOid || subtype == INT4OID))
+ {
+ query = DatumGetInt32(queryDatum);
+ PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
+ GIST_LEAF(entry), &tinfo,
+ fcinfo->flinfo));
+ }
+
+ PG_RETURN_BOOL(gbt_int_consistent_x((int64) kkk->lower, (int64) kkk->upper,
+ queryDatum, subtype, &strategy,
+ GIST_LEAF(entry)));
}
Datum
gbt_int4_distance(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- int32 query = PG_GETARG_INT32(1);
-#ifdef NOT_USED
+ Datum queryDatum = PG_GETARG_DATUM(1);
Oid subtype = PG_GETARG_OID(3);
-#endif
int32KEY *kkk = (int32KEY *) DatumGetPointer(entry->key);
+ int32 query;
GBT_NUMKEY_R key;
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
- PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
- &tinfo, fcinfo->flinfo));
+ if (likely(subtype == InvalidOid || subtype == INT4OID))
+ {
+ query = DatumGetInt32(queryDatum);
+ PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
+ &tinfo, fcinfo->flinfo));
+ }
+
+ PG_RETURN_FLOAT8(gbt_int_distance_x((int64) kkk->lower, (int64) kkk->upper,
+ queryDatum, subtype));
}
Datum
gbt_int4_union(PG_FUNCTION_ARGS)
{
GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
diff --git a/contrib/btree_gist/btree_int8.c b/contrib/btree_gist/btree_int8.c
index f48122c8d84..ae01273b709 100644
--- a/contrib/btree_gist/btree_int8.c
+++ b/contrib/btree_gist/btree_int8.c
@@ -2,12 +2,13 @@
* contrib/btree_gist/btree_int8.c
*/
#include "postgres.h"
#include "btree_gist.h"
#include "btree_utils_num.h"
+#include "catalog/pg_type.h"
#include "common/int.h"
#include "utils/rel.h"
#include "utils/sortsupport.h"
typedef struct int64key
{
@@ -73,13 +74,12 @@ gbt_int8key_cmp(const void *a, const void *b, FmgrInfo *flinfo)
static float8
gbt_int8_dist(const void *a, const void *b, FmgrInfo *flinfo)
{
return GET_FLOAT_DISTANCE(int64, a, b);
}
-
static const gbtree_ninfo tinfo =
{
gbt_t_int8,
sizeof(int64),
16, /* sizeof(gbtreekey16) */
gbt_int8gt,
@@ -109,12 +109,46 @@ int8_dist(PG_FUNCTION_ARGS)
ra = i64abs(r);
PG_RETURN_INT64(ra);
}
+PG_FUNCTION_INFO_V1(int8_int2_dist);
+Datum
+int8_int2_dist(PG_FUNCTION_ARGS)
+{
+ int64 a = PG_GETARG_INT64(0);
+ int64 b = (int64) PG_GETARG_INT16(1);
+ int64 r;
+
+ if (pg_sub_s64_overflow(a, b, &r) ||
+ r == PG_INT64_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ PG_RETURN_INT64(i64abs(r));
+}
+
+PG_FUNCTION_INFO_V1(int8_int4_dist);
+Datum
+int8_int4_dist(PG_FUNCTION_ARGS)
+{
+ int64 a = PG_GETARG_INT64(0);
+ int64 b = (int64) PG_GETARG_INT32(1);
+ int64 r;
+
+ if (pg_sub_s64_overflow(a, b, &r) ||
+ r == PG_INT64_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ PG_RETURN_INT64(i64abs(r));
+}
+
/**************************************************
* GiST support functions
**************************************************/
Datum
@@ -134,47 +168,61 @@ gbt_int8_fetch(PG_FUNCTION_ARGS)
}
Datum
gbt_int8_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- int64 query = PG_GETARG_INT64(1);
+ Datum queryDatum = PG_GETARG_DATUM(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
-#ifdef NOT_USED
Oid subtype = PG_GETARG_OID(3);
-#endif
bool *recheck = (bool *) PG_GETARG_POINTER(4);
int64KEY *kkk = (int64KEY *) DatumGetPointer(entry->key);
+ int64 query;
GBT_NUMKEY_R key;
/* All cases served by this function are exact */
*recheck = false;
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
- PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
- GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
+ if (likely(subtype == InvalidOid || subtype == INT8OID))
+ {
+ query = DatumGetInt64(queryDatum);
+ PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
+ GIST_LEAF(entry), &tinfo,
+ fcinfo->flinfo));
+ }
+
+ PG_RETURN_BOOL(gbt_int_consistent_x(kkk->lower, kkk->upper,
+ queryDatum, subtype, &strategy,
+ GIST_LEAF(entry)));
}
Datum
gbt_int8_distance(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- int64 query = PG_GETARG_INT64(1);
-#ifdef NOT_USED
+ Datum queryDatum = PG_GETARG_DATUM(1);
Oid subtype = PG_GETARG_OID(3);
-#endif
int64KEY *kkk = (int64KEY *) DatumGetPointer(entry->key);
+ int64 query;
GBT_NUMKEY_R key;
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
- PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
- &tinfo, fcinfo->flinfo));
+ if (likely(subtype == InvalidOid || subtype == INT8OID))
+ {
+ query = DatumGetInt64(queryDatum);
+ PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
+ &tinfo, fcinfo->flinfo));
+ }
+
+ PG_RETURN_FLOAT8(gbt_int_distance_x(kkk->lower, kkk->upper,
+ queryDatum, subtype));
}
Datum
gbt_int8_union(PG_FUNCTION_ARGS)
{
GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
diff --git a/contrib/btree_gist/btree_utils_num.c b/contrib/btree_gist/btree_utils_num.c
index 3affe4c2c46..cff47e2d876 100644
--- a/contrib/btree_gist/btree_utils_num.c
+++ b/contrib/btree_gist/btree_utils_num.c
@@ -2,12 +2,14 @@
* contrib/btree_gist/btree_utils_num.c
*/
#include "postgres.h"
#include "btree_gist.h"
#include "btree_utils_num.h"
+#include "catalog/pg_type.h"
+#include "funcapi.h"
#include "utils/cash.h"
#include "utils/date.h"
#include "utils/timestamp.h"
GISTENTRY *
@@ -304,12 +306,144 @@ gbt_num_consistent(const GBT_NUMKEY_R *key,
retval = false;
}
return retval;
}
+/*
+ * Cross-type dispatch table for the integer opclasses.
+ *
+ * This is the single source of truth for which query subtypes the integer
+ * cross-type path understands. gbt_int_query_to_int64() promotes exactly the
+ * subtypes listed here, and the SQL-visible gbt_int_crosstype_subtypes()
+ * exposes the same list so the regression tests can assert that the cross-type
+ * pg_amop entries in gist_int{2,4,8}_ops never drift from what the C code can
+ * actually handle.
+ */
+typedef int64 (*gbt_int_promote_fn) (Datum query);
+
+static int64
+gbt_int2_to_int64(Datum query)
+{
+ return (int64) DatumGetInt16(query);
+}
+
+static int64
+gbt_int4_to_int64(Datum query)
+{
+ return (int64) DatumGetInt32(query);
+}
+
+static int64
+gbt_int8_to_int64(Datum query)
+{
+ return DatumGetInt64(query);
+}
+
+typedef struct
+{
+ Oid subtype;
+ gbt_int_promote_fn promote;
+} gbt_int_crosstype;
+
+static const gbt_int_crosstype gbt_int_crosstype_table[] = {
+ {INT2OID, gbt_int2_to_int64},
+ {INT4OID, gbt_int4_to_int64},
+ {INT8OID, gbt_int8_to_int64},
+};
+
+/*
+ * Promote a cross-type integer query value to int64.
+ *
+ * A subtype outside gbt_int_crosstype_table means the C dispatch is out of sync
+ * with the operator-family registrations in pg_amop, so we treat it as an
+ * internal error. The regression tests assert this never happens.
+ */
+static int64
+gbt_int_query_to_int64(Datum query, Oid subtype)
+{
+ int i;
+
+ for (i = 0; i < lengthof(gbt_int_crosstype_table); i++)
+ {
+ if (gbt_int_crosstype_table[i].subtype == subtype)
+ return gbt_int_crosstype_table[i].promote(query);
+ }
+
+ elog(ERROR, "unrecognized subtype %u for btree_gist integer cross-type comparison",
+ subtype);
+ return 0; /* keep compiler quiet */
+}
+
+/*
+ * gbt_int_crosstype_subtypes
+ *
+ * Expose the contents of gbt_int_crosstype_table to SQL as a set of type OIDs.
+ * The btree_gist regression tests use this to check that the cross-type pg_amop
+ * entries in gist_int{2,4,8}_ops stay in agreement with the C dispatch.
+ */
+PG_FUNCTION_INFO_V1(gbt_int_crosstype_subtypes);
+Datum
+gbt_int_crosstype_subtypes(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+
+ if (SRF_IS_FIRSTCALL())
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ funcctx = SRF_PERCALL_SETUP();
+
+ if (funcctx->call_cntr < lengthof(gbt_int_crosstype_table))
+ {
+ Oid subtype = gbt_int_crosstype_table[funcctx->call_cntr].subtype;
+
+ SRF_RETURN_NEXT(funcctx, ObjectIdGetDatum(subtype));
+ }
+
+ SRF_RETURN_DONE(funcctx);
+}
+
+/*
+ * Cross-type consistent method for the integer opclasses.
+ *
+ * The key range [lower, upper] and the query value are all compared as int64.
+ * The strategy logic mirrors the same-type path in gbt_num_consistent(): the
+ * query value keeps its own width (no narrowing to the indexed column type),
+ * so out-of-range constants behave according to normal integer comparison.
+ */
+bool
+gbt_int_consistent_x(int64 lower, int64 upper, Datum query, Oid subtype,
+ const StrategyNumber *strategy, bool is_leaf)
+{
+ int64 q = gbt_int_query_to_int64(query, subtype);
+
+ switch (*strategy)
+ {
+ case BTLessEqualStrategyNumber:
+ /* some k in [lower,upper] has k <= q iff lower <= q */
+ return lower <= q;
+ case BTLessStrategyNumber:
+ /* leaf: key < q. internal: lower <= q (loose) */
+ return is_leaf ? (lower < q) : (lower <= q);
+ case BTEqualStrategyNumber:
+ if (is_leaf)
+ return lower == q;
+ /* internal: lower <= q <= upper */
+ return (lower <= q && q <= upper);
+ case BTGreaterStrategyNumber:
+ /* leaf: key > q. internal: upper >= q (loose) */
+ return is_leaf ? (upper > q) : (upper >= q);
+ case BTGreaterEqualStrategyNumber:
+ return upper >= q;
+ case BtreeGistNotEqualStrategyNumber:
+ return !(lower == q && upper == q);
+ default:
+ return false;
+ }
+}
+
/*
* The GiST distance method (for KNN-Gist)
*/
float8
@@ -331,12 +465,38 @@ gbt_num_distance(const GBT_NUMKEY_R *key,
else
retval = 0.0;
return retval;
}
+/*
+ * Cross-type distance method for the integer opclasses.
+ *
+ * The distance from the query value to the key range is computed in int64 and
+ * returned as a float8, matching the same-type integer distance behaviour.
+ *
+ * Note that the subtraction is performed in float8, so when the key bound and
+ * the query value differ by more than 2^53 the returned distance loses
+ * precision. This only affects the ordering of KNN results that are nearly
+ * equidistant at that scale; it never changes which rows are returned.
+ */
+float8
+gbt_int_distance_x(int64 lower, int64 upper, Datum query, Oid subtype)
+{
+ int64 q = gbt_int_query_to_int64(query, subtype);
+
+ /* query below the range: distance to lower bound */
+ if (lower >= q)
+ return fabs((float8) lower - (float8) q);
+ /* query above the range: distance to upper bound */
+ if (upper <= q)
+ return fabs((float8) upper - (float8) q);
+ /* query inside the range */
+ return 0.0;
+}
+
GIST_SPLITVEC *
gbt_num_picksplit(const GistEntryVector *entryvec, GIST_SPLITVEC *v,
const gbtree_ninfo *tinfo, FmgrInfo *flinfo)
{
OffsetNumber i,
diff --git a/contrib/btree_gist/btree_utils_num.h b/contrib/btree_gist/btree_utils_num.h
index 53e477d8b1e..d378b0df37e 100644
--- a/contrib/btree_gist/btree_utils_num.h
+++ b/contrib/btree_gist/btree_utils_num.h
@@ -95,12 +95,28 @@ extern bool gbt_num_consistent(const GBT_NUMKEY_R *key, const void *query,
const StrategyNumber *strategy, bool is_leaf,
const gbtree_ninfo *tinfo, FmgrInfo *flinfo);
extern float8 gbt_num_distance(const GBT_NUMKEY_R *key, const void *query,
bool is_leaf, const gbtree_ninfo *tinfo, FmgrInfo *flinfo);
+/*
+ * Cross-type consistent/distance helpers for the integer opclasses
+ * (int2/int4/int8). All three integer widths promote losslessly to int64, so
+ * a cross-type query reduces to a plain int64 comparison and there is no need
+ * for the per-type callback machinery the same-type path uses. These helpers
+ * are deliberately integer-specific rather than a general cross-type framework;
+ * other type families should grow their own helpers shaped to their needs as
+ * the need arises.
+ */
+extern bool gbt_int_consistent_x(int64 lower, int64 upper,
+ Datum query, Oid subtype,
+ const StrategyNumber *strategy, bool is_leaf);
+
+extern float8 gbt_int_distance_x(int64 lower, int64 upper,
+ Datum query, Oid subtype);
+
extern GIST_SPLITVEC *gbt_num_picksplit(const GistEntryVector *entryvec, GIST_SPLITVEC *v,
const gbtree_ninfo *tinfo, FmgrInfo *flinfo);
extern GISTENTRY *gbt_num_compress(GISTENTRY *entry, const gbtree_ninfo *tinfo);
extern GISTENTRY *gbt_num_fetch(GISTENTRY *entry, const gbtree_ninfo *tinfo);
diff --git a/contrib/btree_gist/meson.build b/contrib/btree_gist/meson.build
index 2b1a5463289..c640cb7d8d2 100644
--- a/contrib/btree_gist/meson.build
+++ b/contrib/btree_gist/meson.build
@@ -49,12 +49,13 @@ install_data(
'btree_gist--1.4--1.5.sql',
'btree_gist--1.5--1.6.sql',
'btree_gist--1.6--1.7.sql',
'btree_gist--1.7--1.8.sql',
'btree_gist--1.8--1.9.sql',
'btree_gist--1.9.sql',
+ 'btree_gist--1.9--1.10.sql',
kwargs: contrib_data_args,
)
tests += {
'name': 'btree_gist',
'sd': meson.current_source_dir(),
@@ -89,10 +90,11 @@ tests += {
'uuid',
'not_equal',
'enum',
'bool',
'partitions',
'stratnum',
+ 'int_crosstype',
'without_overlaps',
],
},
}
--
2.51.0
[application/octet-stream] 0002-Add-tests-for-cross-type-operators-for-GiST-indexes.patch (23.9K, 4-0002-Add-tests-for-cross-type-operators-for-GiST-indexes.patch)
download | inline diff:
From e6204108239a2458be11aa2286edd7bd029a3681 Mon Sep 17 00:00:00 2001
From: Alexander Nestorov <[email protected]>
Date: Thu, 4 Jun 2026 00:08:27 +0200
Subject: [PATCH] Add tests for cross-type operators for GiST indexes
---
contrib/btree_gist/expected/int_crosstype.out | 568 ++++++++++++++++++
contrib/btree_gist/sql/int_crosstype.sql | 236 ++++++++
2 files changed, 804 insertions(+)
create mode 100644 contrib/btree_gist/expected/int_crosstype.out
create mode 100644 contrib/btree_gist/sql/int_crosstype.sql
diff --git a/contrib/btree_gist/expected/int_crosstype.out b/contrib/btree_gist/expected/int_crosstype.out
new file mode 100644
index 00000000000..bd2b90bd3f6
--- /dev/null
+++ b/contrib/btree_gist/expected/int_crosstype.out
@@ -0,0 +1,568 @@
+-- Cross-type operator support for int2/int4/int8 in GiST.
+--
+-- Verifies that (a) the cross-type B-tree-style operators registered in
+-- gist_int{2,4,8}_ops match the results of seqscans using the same operator
+-- expressions, (b) the KNN <-> operator works across types and uses the
+-- index, and (c) values outside the smaller subtype's range are handled
+-- according to normal comparison semantics, without narrowing or erroring.
+--
+-- Catalog invariant: the cross-type pg_amop entries in gist_int{2,4,8}_ops must
+-- agree, in both directions, with what the C cross-type dispatch can handle.
+-- The set of supported query subtypes is read from the C side via
+-- gbt_int_crosstype_subtypes(), so there is no hand-maintained second copy of
+-- the list here: registering a pg_amop row whose subtype the dispatch does not
+-- handle (or dropping a dispatch entry while its pg_amop rows remain, or vice
+-- versa) shows up as a diff below. Cross-type pg_amproc rows must also stay
+-- absent, since the dispatch reuses the same-type support functions.
+--
+WITH dispatch_subtypes(typ) AS (
+ SELECT s.subtype::regtype
+ FROM gbt_int_crosstype_subtypes() AS s(subtype)
+),
+gist_int_opclasses(opfamily, indextype) AS (
+ SELECT opc.opcname::text, opc.opcintype::regtype
+ FROM pg_opclass opc
+ JOIN pg_am am ON am.oid = opc.opcmethod
+ WHERE am.amname = 'gist'
+ AND opc.opcname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+),
+expected_pairs(opfamily, lefttype, righttype) AS (
+ SELECT oc.opfamily, oc.indextype, ds.typ
+ FROM gist_int_opclasses oc
+ CROSS JOIN dispatch_subtypes ds
+ WHERE ds.typ <> oc.indextype
+),
+expected_amop(opfamily, lefttype, righttype, strategy, purpose) AS (
+ SELECT opfamily, lefttype, righttype, strategy, purpose
+ FROM expected_pairs
+ CROSS JOIN (VALUES
+ (1, 's'), (2, 's'), (3, 's'), (4, 's'),
+ (5, 's'), (6, 's'), (15, 'o')
+ ) AS strategy_purposes(strategy, purpose)
+),
+actual_amop AS (
+ SELECT opf.opfname::text AS opfamily,
+ amop.amoplefttype::regtype AS lefttype,
+ amop.amoprighttype::regtype AS righttype,
+ amop.amopstrategy::int AS strategy,
+ amop.amoppurpose::text AS purpose
+ FROM pg_amop amop
+ JOIN pg_opfamily opf ON opf.oid = amop.amopfamily
+ JOIN pg_am am ON am.oid = opf.opfmethod
+ WHERE am.amname = 'gist'
+ AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+ AND amop.amoplefttype <> amop.amoprighttype
+)
+SELECT *
+FROM (
+ SELECT 'missing from pg_amop' AS status, *
+ FROM (SELECT * FROM expected_amop EXCEPT SELECT * FROM actual_amop) missing
+ UNION ALL
+ SELECT 'unexpected in pg_amop' AS status, *
+ FROM (SELECT * FROM actual_amop EXCEPT SELECT * FROM expected_amop) unexpected
+) diff
+ORDER BY status, opfamily, lefttype::text, righttype::text, strategy, purpose;
+ status | opfamily | lefttype | righttype | strategy | purpose
+--------+----------+----------+-----------+----------+---------
+(0 rows)
+
+SELECT opf.opfname AS opfamily,
+ amproc.amproclefttype::regtype AS lefttype,
+ amproc.amprocrighttype::regtype AS righttype,
+ amproc.amprocnum AS procnum,
+ amproc.amproc::regproc AS proc
+FROM pg_amproc amproc
+ JOIN pg_opfamily opf ON opf.oid = amproc.amprocfamily
+ JOIN pg_am am ON am.oid = opf.opfmethod
+WHERE am.amname = 'gist'
+ AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+ AND amproc.amproclefttype <> amproc.amprocrighttype
+ORDER BY opf.opfname,
+ amproc.amproclefttype::regtype::text,
+ amproc.amprocrighttype::regtype::text,
+ amproc.amprocnum,
+ amproc.amproc::regproc::text;
+ opfamily | lefttype | righttype | procnum | proc
+----------+----------+-----------+---------+------
+(0 rows)
+
+CREATE TABLE ct_i2 (a int2);
+CREATE TABLE ct_i4 (a int4);
+CREATE TABLE ct_i8 (a int8);
+INSERT INTO ct_i2 SELECT g::int2 FROM generate_series(-100, 100) g;
+INSERT INTO ct_i4 SELECT g FROM generate_series(-100, 100) g;
+INSERT INTO ct_i8 SELECT g::int8 FROM generate_series(-100, 100) g;
+-- Add some values that are representable only in wider types, to exercise
+-- the path where the cross-type query constant is out of range of the key
+-- type.
+INSERT INTO ct_i4 VALUES (100000), (-100000);
+INSERT INTO ct_i8 VALUES (5000000000), (-5000000000);
+CREATE INDEX ct_i2_idx ON ct_i2 USING gist (a);
+CREATE INDEX ct_i4_idx ON ct_i4 USING gist (a);
+CREATE INDEX ct_i8_idx ON ct_i8 USING gist (a);
+ANALYZE ct_i2;
+ANALYZE ct_i4;
+ANALYZE ct_i8;
+SET enable_seqscan = off;
+SET enable_bitmapscan = off;
+-- int2 key x int4 query
+SELECT count(*) FROM ct_i2 WHERE a < 50::int4;
+ count
+-------
+ 150
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int4;
+ count
+-------
+ 151
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a = 50::int4;
+ count
+-------
+ 1
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int4;
+ count
+-------
+ 51
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a > 50::int4;
+ count
+-------
+ 50
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int4;
+ count
+-------
+ 200
+(1 row)
+
+-- query out of int2 range: matches nothing for =, everything for <>
+SELECT count(*) FROM ct_i2 WHERE a = 100000::int4;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <> 100000::int4;
+ count
+-------
+ 201
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a < 100000::int4;
+ count
+-------
+ 201
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a > 100000::int4;
+ count
+-------
+ 0
+(1 row)
+
+-- int2 key x int8 query
+SELECT count(*) FROM ct_i2 WHERE a < 50::int8;
+ count
+-------
+ 150
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int8;
+ count
+-------
+ 151
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a = 50::int8;
+ count
+-------
+ 1
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int8;
+ count
+-------
+ 51
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a > 50::int8;
+ count
+-------
+ 50
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int8;
+ count
+-------
+ 200
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a = 5000000000::int8;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM ct_i2 WHERE a < 5000000000::int8;
+ count
+-------
+ 201
+(1 row)
+
+-- int4 key x int2 query
+SELECT count(*) FROM ct_i4 WHERE a < 50::int2;
+ count
+-------
+ 151
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int2;
+ count
+-------
+ 152
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a = 50::int2;
+ count
+-------
+ 1
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int2;
+ count
+-------
+ 52
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a > 50::int2;
+ count
+-------
+ 51
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int2;
+ count
+-------
+ 202
+(1 row)
+
+-- int4 key x int8 query
+SELECT count(*) FROM ct_i4 WHERE a < 50::int8;
+ count
+-------
+ 151
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int8;
+ count
+-------
+ 152
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a = 50::int8;
+ count
+-------
+ 1
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int8;
+ count
+-------
+ 52
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a > 50::int8;
+ count
+-------
+ 51
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int8;
+ count
+-------
+ 202
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a = 5000000000::int8;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM ct_i4 WHERE a < 5000000000::int8;
+ count
+-------
+ 203
+(1 row)
+
+-- int8 key x int2 query
+SELECT count(*) FROM ct_i8 WHERE a < 50::int2;
+ count
+-------
+ 151
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int2;
+ count
+-------
+ 152
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a = 50::int2;
+ count
+-------
+ 1
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int2;
+ count
+-------
+ 52
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a > 50::int2;
+ count
+-------
+ 51
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int2;
+ count
+-------
+ 202
+(1 row)
+
+-- int8 key x int4 query
+SELECT count(*) FROM ct_i8 WHERE a < 50::int4;
+ count
+-------
+ 151
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int4;
+ count
+-------
+ 152
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a = 50::int4;
+ count
+-------
+ 1
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int4;
+ count
+-------
+ 52
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a > 50::int4;
+ count
+-------
+ 51
+(1 row)
+
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int4;
+ count
+-------
+ 202
+(1 row)
+
+-- Confirm the index is actually used for a cross-type predicate.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_i4 WHERE a = 50::int8;
+ QUERY PLAN
+------------------------------------------------
+ Aggregate
+ -> Index Only Scan using ct_i4_idx on ct_i4
+ Index Cond: (a = '50'::bigint)
+(3 rows)
+
+-- Cross-type KNN: int4 key ordered by int2 / int8 queries.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+ QUERY PLAN
+------------------------------------------------
+ Limit
+ -> Index Only Scan using ct_i4_idx on ct_i4
+ Order By: (a <-> '-100'::smallint)
+(3 rows)
+
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+ a
+------
+ -100
+ -99
+ -98
+(3 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+ QUERY PLAN
+------------------------------------------------
+ Limit
+ -> Index Only Scan using ct_i4_idx on ct_i4
+ Order By: (a <-> '-100'::bigint)
+(3 rows)
+
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+ a
+------
+ -100
+ -99
+ -98
+(3 rows)
+
+-- Cross-type KNN: int2 key ordered by int4 / int8 queries.
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int4 LIMIT 3;
+ a
+------
+ -100
+ -99
+ -98
+(3 rows)
+
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int8 LIMIT 3;
+ a
+------
+ -100
+ -99
+ -98
+(3 rows)
+
+-- Cross-type KNN: int8 key ordered by int2 / int4 queries.
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int2 LIMIT 3;
+ a
+------
+ -100
+ -99
+ -98
+(3 rows)
+
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int4 LIMIT 3;
+ a
+------
+ -100
+ -99
+ -98
+(3 rows)
+
+-- Combined: cross-type WHERE + cross-type ORDER BY on the same index.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+ QUERY PLAN
+------------------------------------------------
+ Limit
+ -> Index Only Scan using ct_i4_idx on ct_i4
+ Index Cond: (a < '80'::bigint)
+ Order By: (a <-> '-100'::bigint)
+(4 rows)
+
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+ a
+------
+ -100
+ -99
+ -98
+(3 rows)
+
+-- Standalone distance-function smoke tests (not going through the index),
+-- including the overflow-detection paths.
+SELECT int2_int4_dist(3::int2, 10::int4);
+ int2_int4_dist
+----------------
+ 7
+(1 row)
+
+SELECT int4_int2_dist(-5::int4, 5::int2);
+ int4_int2_dist
+----------------
+ 10
+(1 row)
+
+SELECT int2_int8_dist(3::int2, 10::int8);
+ int2_int8_dist
+----------------
+ 7
+(1 row)
+
+SELECT int8_int2_dist(100::int8, -5::int2);
+ int8_int2_dist
+----------------
+ 105
+(1 row)
+
+SELECT int4_int8_dist(100::int4, 5000000000::int8);
+ int4_int8_dist
+----------------
+ 4999999900
+(1 row)
+
+SELECT int8_int4_dist(5000000000::int8, 100::int4);
+ int8_int4_dist
+----------------
+ 4999999900
+(1 row)
+
+-- Overflow detection: INT32_MIN distance from a positive int2 can't fit
+-- in int32, should error.
+SELECT int2_int4_dist(1::int2, -2147483648::int4);
+ERROR: integer out of range
+-- Likewise INT64_MIN distance from a positive int4 can't fit in int64.
+SELECT int4_int8_dist(1::int4, -9223372036854775808::int8);
+ERROR: bigint out of range
+--
+-- Multi-column GiST index with mixed-type predicates. This is the
+-- original motivating case: without cross-type operator support the
+-- planner can only use one column as an Index Cond and applies the
+-- other(s) as a Filter post-scan. Here both columns should appear as
+-- Index Cond.
+--
+CREATE TABLE ct_multi (a int4, b int8);
+INSERT INTO ct_multi
+ SELECT g, (g * 2)::int8 FROM generate_series(-50, 50) g;
+CREATE INDEX ct_multi_idx ON ct_multi USING gist (a, b);
+ANALYZE ct_multi;
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+ QUERY PLAN
+-------------------------------------------------------
+ Aggregate
+ -> Index Only Scan using ct_multi_idx on ct_multi
+ Index Cond: ((a = '25'::bigint) AND (b = 50))
+(3 rows)
+
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+ count
+-------
+ 1
+(1 row)
+
+-- Mixed cross-type ranges across both columns.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+ QUERY PLAN
+------------------------------------------------------------------
+ Aggregate
+ -> Index Only Scan using ct_multi_idx on ct_multi
+ Index Cond: ((a < '10'::bigint) AND (b > '0'::smallint))
+(3 rows)
+
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+ count
+-------
+ 9
+(1 row)
+
+DROP TABLE ct_multi;
+DROP TABLE ct_i2;
+DROP TABLE ct_i4;
+DROP TABLE ct_i8;
diff --git a/contrib/btree_gist/sql/int_crosstype.sql b/contrib/btree_gist/sql/int_crosstype.sql
new file mode 100644
index 00000000000..d38084f1941
--- /dev/null
+++ b/contrib/btree_gist/sql/int_crosstype.sql
@@ -0,0 +1,236 @@
+-- Cross-type operator support for int2/int4/int8 in GiST.
+--
+-- Verifies that (a) the cross-type B-tree-style operators registered in
+-- gist_int{2,4,8}_ops match the results of seqscans using the same operator
+-- expressions, (b) the KNN <-> operator works across types and uses the
+-- index, and (c) values outside the smaller subtype's range are handled
+-- according to normal comparison semantics, without narrowing or erroring.
+
+--
+-- Catalog invariant: the cross-type pg_amop entries in gist_int{2,4,8}_ops must
+-- agree, in both directions, with what the C cross-type dispatch can handle.
+-- The set of supported query subtypes is read from the C side via
+-- gbt_int_crosstype_subtypes(), so there is no hand-maintained second copy of
+-- the list here: registering a pg_amop row whose subtype the dispatch does not
+-- handle (or dropping a dispatch entry while its pg_amop rows remain, or vice
+-- versa) shows up as a diff below. Cross-type pg_amproc rows must also stay
+-- absent, since the dispatch reuses the same-type support functions.
+--
+WITH dispatch_subtypes(typ) AS (
+ SELECT s.subtype::regtype
+ FROM gbt_int_crosstype_subtypes() AS s(subtype)
+),
+gist_int_opclasses(opfamily, indextype) AS (
+ SELECT opc.opcname::text, opc.opcintype::regtype
+ FROM pg_opclass opc
+ JOIN pg_am am ON am.oid = opc.opcmethod
+ WHERE am.amname = 'gist'
+ AND opc.opcname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+),
+expected_pairs(opfamily, lefttype, righttype) AS (
+ SELECT oc.opfamily, oc.indextype, ds.typ
+ FROM gist_int_opclasses oc
+ CROSS JOIN dispatch_subtypes ds
+ WHERE ds.typ <> oc.indextype
+),
+expected_amop(opfamily, lefttype, righttype, strategy, purpose) AS (
+ SELECT opfamily, lefttype, righttype, strategy, purpose
+ FROM expected_pairs
+ CROSS JOIN (VALUES
+ (1, 's'), (2, 's'), (3, 's'), (4, 's'),
+ (5, 's'), (6, 's'), (15, 'o')
+ ) AS strategy_purposes(strategy, purpose)
+),
+actual_amop AS (
+ SELECT opf.opfname::text AS opfamily,
+ amop.amoplefttype::regtype AS lefttype,
+ amop.amoprighttype::regtype AS righttype,
+ amop.amopstrategy::int AS strategy,
+ amop.amoppurpose::text AS purpose
+ FROM pg_amop amop
+ JOIN pg_opfamily opf ON opf.oid = amop.amopfamily
+ JOIN pg_am am ON am.oid = opf.opfmethod
+ WHERE am.amname = 'gist'
+ AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+ AND amop.amoplefttype <> amop.amoprighttype
+)
+SELECT *
+FROM (
+ SELECT 'missing from pg_amop' AS status, *
+ FROM (SELECT * FROM expected_amop EXCEPT SELECT * FROM actual_amop) missing
+ UNION ALL
+ SELECT 'unexpected in pg_amop' AS status, *
+ FROM (SELECT * FROM actual_amop EXCEPT SELECT * FROM expected_amop) unexpected
+) diff
+ORDER BY status, opfamily, lefttype::text, righttype::text, strategy, purpose;
+
+SELECT opf.opfname AS opfamily,
+ amproc.amproclefttype::regtype AS lefttype,
+ amproc.amprocrighttype::regtype AS righttype,
+ amproc.amprocnum AS procnum,
+ amproc.amproc::regproc AS proc
+FROM pg_amproc amproc
+ JOIN pg_opfamily opf ON opf.oid = amproc.amprocfamily
+ JOIN pg_am am ON am.oid = opf.opfmethod
+WHERE am.amname = 'gist'
+ AND opf.opfname IN ('gist_int2_ops', 'gist_int4_ops', 'gist_int8_ops')
+ AND amproc.amproclefttype <> amproc.amprocrighttype
+ORDER BY opf.opfname,
+ amproc.amproclefttype::regtype::text,
+ amproc.amprocrighttype::regtype::text,
+ amproc.amprocnum,
+ amproc.amproc::regproc::text;
+
+CREATE TABLE ct_i2 (a int2);
+CREATE TABLE ct_i4 (a int4);
+CREATE TABLE ct_i8 (a int8);
+
+INSERT INTO ct_i2 SELECT g::int2 FROM generate_series(-100, 100) g;
+INSERT INTO ct_i4 SELECT g FROM generate_series(-100, 100) g;
+INSERT INTO ct_i8 SELECT g::int8 FROM generate_series(-100, 100) g;
+
+-- Add some values that are representable only in wider types, to exercise
+-- the path where the cross-type query constant is out of range of the key
+-- type.
+INSERT INTO ct_i4 VALUES (100000), (-100000);
+INSERT INTO ct_i8 VALUES (5000000000), (-5000000000);
+
+CREATE INDEX ct_i2_idx ON ct_i2 USING gist (a);
+CREATE INDEX ct_i4_idx ON ct_i4 USING gist (a);
+CREATE INDEX ct_i8_idx ON ct_i8 USING gist (a);
+
+ANALYZE ct_i2;
+ANALYZE ct_i4;
+ANALYZE ct_i8;
+
+SET enable_seqscan = off;
+SET enable_bitmapscan = off;
+
+-- int2 key x int4 query
+SELECT count(*) FROM ct_i2 WHERE a < 50::int4;
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int4;
+SELECT count(*) FROM ct_i2 WHERE a = 50::int4;
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int4;
+SELECT count(*) FROM ct_i2 WHERE a > 50::int4;
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int4;
+
+-- query out of int2 range: matches nothing for =, everything for <>
+SELECT count(*) FROM ct_i2 WHERE a = 100000::int4;
+SELECT count(*) FROM ct_i2 WHERE a <> 100000::int4;
+SELECT count(*) FROM ct_i2 WHERE a < 100000::int4;
+SELECT count(*) FROM ct_i2 WHERE a > 100000::int4;
+
+-- int2 key x int8 query
+SELECT count(*) FROM ct_i2 WHERE a < 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a <= 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a = 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a >= 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a > 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a <> 50::int8;
+SELECT count(*) FROM ct_i2 WHERE a = 5000000000::int8;
+SELECT count(*) FROM ct_i2 WHERE a < 5000000000::int8;
+
+-- int4 key x int2 query
+SELECT count(*) FROM ct_i4 WHERE a < 50::int2;
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int2;
+SELECT count(*) FROM ct_i4 WHERE a = 50::int2;
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int2;
+SELECT count(*) FROM ct_i4 WHERE a > 50::int2;
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int2;
+
+-- int4 key x int8 query
+SELECT count(*) FROM ct_i4 WHERE a < 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a <= 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a = 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a >= 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a > 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a <> 50::int8;
+SELECT count(*) FROM ct_i4 WHERE a = 5000000000::int8;
+SELECT count(*) FROM ct_i4 WHERE a < 5000000000::int8;
+
+-- int8 key x int2 query
+SELECT count(*) FROM ct_i8 WHERE a < 50::int2;
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int2;
+SELECT count(*) FROM ct_i8 WHERE a = 50::int2;
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int2;
+SELECT count(*) FROM ct_i8 WHERE a > 50::int2;
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int2;
+
+-- int8 key x int4 query
+SELECT count(*) FROM ct_i8 WHERE a < 50::int4;
+SELECT count(*) FROM ct_i8 WHERE a <= 50::int4;
+SELECT count(*) FROM ct_i8 WHERE a = 50::int4;
+SELECT count(*) FROM ct_i8 WHERE a >= 50::int4;
+SELECT count(*) FROM ct_i8 WHERE a > 50::int4;
+SELECT count(*) FROM ct_i8 WHERE a <> 50::int4;
+
+-- Confirm the index is actually used for a cross-type predicate.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_i4 WHERE a = 50::int8;
+
+-- Cross-type KNN: int4 key ordered by int2 / int8 queries.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int2 LIMIT 3;
+
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+SELECT a FROM ct_i4 ORDER BY a <-> '-100'::int8 LIMIT 3;
+
+-- Cross-type KNN: int2 key ordered by int4 / int8 queries.
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int4 LIMIT 3;
+SELECT a FROM ct_i2 ORDER BY a <-> '-100'::int8 LIMIT 3;
+
+-- Cross-type KNN: int8 key ordered by int2 / int4 queries.
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int2 LIMIT 3;
+SELECT a FROM ct_i8 ORDER BY a <-> '-100'::int4 LIMIT 3;
+
+-- Combined: cross-type WHERE + cross-type ORDER BY on the same index.
+EXPLAIN (COSTS OFF)
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+SELECT a FROM ct_i4 WHERE a < 80::int8 ORDER BY a <-> '-100'::int8 LIMIT 3;
+
+-- Standalone distance-function smoke tests (not going through the index),
+-- including the overflow-detection paths.
+SELECT int2_int4_dist(3::int2, 10::int4);
+SELECT int4_int2_dist(-5::int4, 5::int2);
+SELECT int2_int8_dist(3::int2, 10::int8);
+SELECT int8_int2_dist(100::int8, -5::int2);
+SELECT int4_int8_dist(100::int4, 5000000000::int8);
+SELECT int8_int4_dist(5000000000::int8, 100::int4);
+
+-- Overflow detection: INT32_MIN distance from a positive int2 can't fit
+-- in int32, should error.
+SELECT int2_int4_dist(1::int2, -2147483648::int4);
+-- Likewise INT64_MIN distance from a positive int4 can't fit in int64.
+SELECT int4_int8_dist(1::int4, -9223372036854775808::int8);
+
+--
+-- Multi-column GiST index with mixed-type predicates. This is the
+-- original motivating case: without cross-type operator support the
+-- planner can only use one column as an Index Cond and applies the
+-- other(s) as a Filter post-scan. Here both columns should appear as
+-- Index Cond.
+--
+CREATE TABLE ct_multi (a int4, b int8);
+INSERT INTO ct_multi
+ SELECT g, (g * 2)::int8 FROM generate_series(-50, 50) g;
+CREATE INDEX ct_multi_idx ON ct_multi USING gist (a, b);
+ANALYZE ct_multi;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+
+SELECT count(*) FROM ct_multi WHERE a = 25::int8 AND b = 50::int4;
+
+-- Mixed cross-type ranges across both columns.
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+
+SELECT count(*) FROM ct_multi WHERE a < 10::int8 AND b > 0::int2;
+
+DROP TABLE ct_multi;
+
+DROP TABLE ct_i2;
+DROP TABLE ct_i4;
+DROP TABLE ct_i8;
--
2.51.0
view thread (9+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: [PATCH] btree_gist: add cross-type integer operator support for GiST
In-Reply-To: <80ef3b41-1a71-47a4-a320-29e118d7092c@Spark>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox