public inbox for [email protected]
help / color / mirror / Atom feedHash-based MCV matching for large IN-lists
6+ messages / 2 participants
[nested] [flat]
* Hash-based MCV matching for large IN-lists
@ 2025-12-29 20:35 Ilia Evdokimov <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Ilia Evdokimov @ 2025-12-29 20:35 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>; David Geier <[email protected]>
Hi hackers,
When estimating selectivity for ScalarArrayOpExpr (IN, ANY, ALL) and MCV
statistics are available for the column, the planner currently matches
IN-list elements against the MCV array using nested loop. For large
IN-list and large MCV arrays this results in O(N*M) behavior, which can
become unnecessarily expensive during planning.
Thanks to David for pointing out this case [0]
This patch introduces a hash-based matching path, analogous to what is
already done for MCV matching in join selectivity estimation (057012b
commit). Instead of linearly scanning the MCV array for each IN-list
element, we build a hash table and probe it to identify matches.
The hash table is built over the MCV values, not over the IN-list. The
IN-list may contain NULLs, non-Const expressions, and duplicate values,
whereas the MCV list is guaranteed to contain distinct, non-NULL values
and represents the statistically meaningful domain we are matching
against. Hashing the MCVs therefore avoids duplicate work and directly
supports selectivity estimation.
For each IN-list element, if a matching MCV is found, we add the
corresponding MCV frequency to the selectivity estimate. If no match is
found, the remaining selectivity is estimated in the same way as the
existing non-MCV path (similar to var_eq_const when the constant is not
present in the MCV list).
The hash-based path is enabled only when both a sufficiently large
IN-list and an MCV list are present, and suitable hash functions exist
for the equality operator. The threshold is currently the same as the
one used for join MCV hashing, since the underlying algorithmic
tradeoffs are similar.
Example:
CREATE TABLE t (x int);
INSERT INTO t SELECT x % 10000 FROM generate_series(1, 3000000) x;
ALTER TABLE t ALTER COLUMN x SET STATISTICS 10000;
ANALYZE t;
Before patch:
EXPLAIN (SUMMARY) SELECT * FROM t WHERE x IN (1,2,...,2000);
Seq Scan on t (cost=5.00..58280.00 rows=600000 width=4)
Filter: (x = ANY ('{1,2,...,2000}'::integer[]))
* Planning Time: 57.137 ms*
(3 rows)
After patch:
EXPLAIN (SUMMARY) SELECT * FROM t WHERE x IN (1,2,...,2000);
Seq Scan on t (cost=5.00..58280.00 rows=600000 width=4)
Filter: (x = ANY ('{1,2,...,2000}'::integer[]))
* Planning Time: 0.558 ms*
(3 rows)
Comments, suggestions, and alternative approaches are welcome!
[0]:
https://www.postgresql.org/message-id/b6316b99-565b-4c89-aa08-6aea51f54526%40gmail.com
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] v1-0001-Use-hash-based-matching-for-MCVs-in-ScalarArrayOp.patch (15.6K, 3-v1-0001-Use-hash-based-matching-for-MCVs-in-ScalarArrayOp.patch)
download | inline diff:
From 57695ce330b758fc1ac165c9c96c0a4dd397b5d0 Mon Sep 17 00:00:00 2001
From: Evdokimov Ilia <[email protected]>
Date: Mon, 29 Dec 2025 23:28:51 +0300
Subject: [PATCH v1] Use hash-based matching for MCVs in ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using a nested loop, resulting in O(N*M) behavior
for large IN-lists and MCV lists.
Introduce a hash-based matching path, analogous to the one used for MCV
matching in join selectivity estimation. Build a hash table over the MCV
values and probe it for each IN-list element, avoiding repeated linear
scans of the MCV array.
The hash table is built over the MCV list rather than the IN-list, since
MCVs are guaranteed to be distinct and non-NULL, while IN-lists may
contain duplicates, NULLs, or non-Const expressions.
The hash-based path is enabled only when both a sufficiently large
IN-list and an MCV list are present, and suitable hash functions exist
for the equality operator.
---
src/backend/utils/adt/selfuncs.c | 390 ++++++++++++++++++++++++++++++-
src/tools/pgindent/typedefs.list | 3 +
2 files changed, 392 insertions(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a996f0c4939..fcdf3b7ac1e 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,7 +146,7 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
@@ -156,6 +156,12 @@
#define EQJOINSEL_MCV_HASH_THRESHOLD 20
#endif
+#ifndef USE_ASSERT_CHECKING
+#define SCALARARRAY_MCV_HASH_THRESHOLD 200
+#else
+#define SCALARARRAY_MCV_HASH_THRESHOLD 20
+#endif
+
/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
typedef struct MCVHashEntry
{
@@ -176,14 +182,40 @@ typedef struct MCVHashContext
int16 hash_typlen; /* typlen of hashed data type */
} MCVHashContext;
+/* Entries in the simplehash hash table used by scalararray_mcv_hash_match */
+typedef struct MCVInHashEntry
+{
+ Datum value; /* the value represented by this entry */
+ int index; /* its index in the relevant AttStatsSlot */
+ uint32 hash; /* hash code for the Datum */
+ char status; /* status code used by simplehash.h */
+} MCVInHashEntry;
+
+/* private_data for the simplehash hash table */
+typedef struct MCVInHashContext
+{
+ FunctionCallInfo equal_fcinfo; /* the equality join operator */
+ FunctionCallInfo hash_fcinfo; /* the hash function to use */
+ bool insert_mode; /* doing inserts or lookups? */
+ bool hash_typbyval; /* typbyval of hashed data type */
+ int16 hash_typlen; /* typlen of hashed data type */
+} MCVInHashContext;
+
/* forward reference */
typedef struct MCVHashTable_hash MCVHashTable_hash;
+typedef struct MCVInHashTable_hash MCVInHashTable_hash;
/* Hooks for plugins to get control when we ask for stats */
get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left, Datum *elem_values,
+ bool *elem_nulls, int num_elems,
+ bool useOr, bool isEquality, bool isInequality);
+static uint32 hash_mcv_in(MCVInHashTable_hash *tab, Datum key);
+static bool mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -287,6 +319,19 @@ static double btcost_correlation(IndexOptInfo *index,
#define SH_DECLARE
#include "lib/simplehash.h"
+#define SH_PREFIX MCVInHashTable
+#define SH_ELEMENT_TYPE MCVInHashEntry
+#define SH_KEY_TYPE Datum
+#define SH_KEY value
+#define SH_HASH_KEY(tab,key) hash_mcv_in(tab, key)
+#define SH_EQUAL(tab,key0,key1) mcvs_in_equal(tab, key0, key1)
+#define SH_SCOPE static inline
+#define SH_STORE_HASH
+#define SH_GET_HASH(tab,ent) (ent)->hash
+#define SH_DEFINE
+#define SH_DECLARE
+#include "lib/simplehash.h"
+
/*
* eqsel - Selectivity of "=" for any data types.
@@ -2025,6 +2070,35 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching.
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then punt and return a default estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ s1 = scalararray_mcv_hash_match(&vardata, operator, clause->inputcollid,
+ other_op, var_on_left, elem_values,
+ elem_nulls, num_elems,
+ useOr, isEquality, isInequality);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2210,6 +2284,320 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr using MCV statistics and,
+ * when beneficial, a hash table for efficient matching.
+ *
+ * The function evaluates selectivity by comparing each element of the
+ * IN-list against the column's most-common values (MCVs).
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * other_op: expression for the non-variable side of the comparison
+ * var_on_left: true if the variable is on the left side of the operator
+ * elem_values: array of IN-list element values
+ * elem_nulls: array indicating which IN-list elements are NULL
+ * num_elems: number of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator is equality
+ * isInequality: true if the operator is inequality
+ *
+ * Result:
+ * Returns a selectivity estimate in the range [0.0, 1.0], or -1.0 if the
+ * selectivity cannot be reliably estimated by this function.
+ */
+static double
+scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left, Datum *elem_values,
+ bool *elem_nulls, int num_elems,
+ bool useOr, bool isEquality, bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ Assert(isEquality || isInequality);
+
+ /*
+ * If we matched the var to a unique index, DISTINCT or GROUP-BY clause,
+ * assume there is exactly one match regardless of anything else. (This
+ * is slightly bogus, since the index or clause's equality operator might
+ * be different from ours, but it's much more likely to be right than
+ * ignoring the information.)
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * When asked about <>, we do the estimation using the corresponding =
+ * operator.
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+
+ /*
+ * If the combined size of the MCV list and the IN-list is large
+ * enough to justify hashing, attempt to look up hash functions for
+ * the operator.
+ */
+ if (sslot.nvalues + num_elems >= SCALARARRAY_MCV_HASH_THRESHOLD)
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVInHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVInHashContext hashContext;
+ double sumallcommon = 0.0,
+ remaining_selec = 0.0;
+ bool isdefault;
+ double otherdistinct;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ /*
+ * Build the hash table using the MCV array as keys, rather than the
+ * IN-list elements.
+ *
+ * The MCV list is guaranteed to contain distinct values. Hashing the
+ * MCVs allows us to efficiently identify whether a given IN-list
+ * element matches a most-common value, which is exactly what is
+ * needed for accurate selectivity estimation.
+ */
+ fmgr_info(hashLeft, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(sslot.valuetype,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVInHashTable_create(CurrentMemoryContext,
+ sslot.nvalues,
+ &hashContext);
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ {
+ bool found = false;
+ MCVInHashEntry *entry;
+
+ entry = MCVInHashTable_insert(hashTable, sslot.values[i], &found);
+
+ /*
+ * MCVHashTable_insert will only report "found" if the new value
+ * is equal to some previous one per datum_image_eq(). That
+ * probably shouldn't happen, since we're not expecting duplicates
+ * in the MCV list. If we do find a dup, just ignore it, leaving
+ * the hash entry's index pointing at the first occurrence. That
+ * matches the behavior that the non-hashed code path would have.
+ */
+ if (likely(!found))
+ entry->index = i;
+
+ sumallcommon += sslot.numbers[i];
+ }
+
+ /*
+ * Prepare to probe the hash table. If the probe values are of a
+ * different data type, then we need to change hash functions. (This
+ * code relies on the assumption that since we defined SH_STORE_HASH,
+ * simplehash.h will never need to compute hash values for existing
+ * hash table entries.)
+ */
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+
+ fmgr_info(hashRight, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ /*
+ * Comparison is against a constant that is neither NULL nor any of
+ * the common values. Its selectivity cannot be more than this:
+ */
+ remaining_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(remaining_selec);
+
+ /*
+ * and in fact it's probably a good deal less. We approximate that all
+ * the not-common values share this remaining fraction equally, so we
+ * divide by the number of other distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ remaining_selec /= otherdistinct;
+
+ /*
+ * Another cross-check: selectivity shouldn't be estimated as more
+ * than the least common "most common value".
+ */
+ if (sslot.nnumbers > 0 && remaining_selec > sslot.numbers[sslot.nnumbers - 1])
+ remaining_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Evaluate selectivity contribution of each IN-list element. */
+ for (int i = 0; i < num_elems; i++)
+ {
+ MCVInHashEntry *entry;
+ Selectivity s1;
+
+ /*
+ * If the constant is NULL, assume operator is strict and return
+ * s1 zero. (It's zero even for a negator op.)
+ */
+ if (elem_nulls[i])
+ continue;
+
+ if (IsA(other_op, Const))
+ {
+ entry = MCVInHashTable_lookup(hashTable, elem_values[i]);
+
+ if (entry != NULL)
+ {
+ /*
+ * As in the other code path, skip already-matched hash
+ * entries
+ */
+ s1 = sslot.numbers[entry->index];
+ }
+ else
+ s1 = remaining_selec;
+
+ if (isInequality)
+ s1 = 1.0 - s1 - nullfrac;
+ }
+ else
+ s1 = var_eq_non_const(vardata, operator, collation, other_op,
+ var_on_left, isInequality);
+
+ CLAMP_PROBABILITY(s1);
+
+ if (useOr)
+ {
+ selec = selec + s1 - selec * s1;
+ if (isEquality)
+ s1disjoint += s1;
+ }
+ else
+ {
+ selec = selec * s1;
+ if (isInequality)
+ s1disjoint += s1 - 1.0;
+ }
+ }
+
+ MCVInHashTable_destroy(hashTable);
+
+ /* accept disjoint-probability estimate if in range */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+
+ CLAMP_PROBABILITY(selec);
+ }
+
+ free_attstatsslot(&sslot);
+
+ return selec;
+}
+
+/*
+ * Support functions for the hash tables used by eqjoinsel_find_matches
+ */
+static uint32
+hash_mcv_in(MCVInHashTable_hash *tab, Datum key)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+ FunctionCallInfo fcinfo = context->hash_fcinfo;
+ Datum fresult;
+
+ fcinfo->args[0].value = key;
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ Assert(!fcinfo->isnull);
+ return DatumGetUInt32(fresult);
+}
+
+static bool
+mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+
+ if (context->insert_mode)
+ {
+ /*
+ * During the insertion step, any comparisons will be between two
+ * Datums of the hash table's data type, so if the given operator is
+ * cross-type it will be the wrong thing to use. Fortunately, we can
+ * use datum_image_eq instead. The MCV values should all be distinct
+ * anyway, so it's mostly pro-forma to compare them at all.
+ */
+ return datum_image_eq(key0, key1,
+ context->hash_typbyval, context->hash_typlen);
+ }
+ else
+ {
+ FunctionCallInfo fcinfo = context->equal_fcinfo;
+ Datum fresult;
+
+ fcinfo->args[0].value = key0;
+ fcinfo->args[1].value = key1;
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ return (!fcinfo->isnull && DatumGetBool(fresult));
+ }
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index ceb3fc5d980..17f5cc9476b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1666,6 +1666,9 @@ MBuf
MCVHashContext
MCVHashEntry
MCVHashTable_hash
+MCVInHashContext
+MCVInHashEntry
+MCVInHashTable_hash
MCVItem
MCVList
MEMORY_BASIC_INFORMATION
--
2.34.1
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-03-02 21:37 Zsolt Parragi <[email protected]>
parent: Ilia Evdokimov <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Zsolt Parragi @ 2026-03-02 21:37 UTC (permalink / raw)
To: Ilia Evdokimov <[email protected]>; +Cc: David Geier <[email protected]>; Chengpeng Yan <[email protected]>; Tatsuya Kawata <[email protected]>; [email protected] <[email protected]>
Hello!
+ if (vardata.isunique && vardata.rel && vardata.rel->tuples >= 1.0)
+ {
+ s2 = 1.0 / vardata.rel->tuples;
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+ if (isInequality)
+ s2 = 1.0 - s2 - stats->stanullfrac;
+ }
+ }
Isn't there's a corner case where this if order returns an incorrect
estimate/regression?
See the following test:
CREATE TABLE test AS SELECT generate_series(1, 1000) AS id;
CREATE UNIQUE INDEX ON test(id);
-- no ANALYZE
EXPLAIN SELECT * FROM test WHERE id <> ALL(ARRAY[1, 2, 3]);
-- Actual: rows=1
-- Expected: rows=997
ANALYZE test;
EXPLAIN SELECT * FROM test WHERE id <> ALL(ARRAY[1, 2, 3]);
-- Correct: rows=997
DROP TABLE test;
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-03-10 14:55 Ilia Evdokimov <[email protected]>
parent: Zsolt Parragi <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Ilia Evdokimov @ 2026-03-10 14:55 UTC (permalink / raw)
To: Zsolt Parragi <[email protected]>; +Cc: David Geier <[email protected]>; Chengpeng Yan <[email protected]>; Tatsuya Kawata <[email protected]>; [email protected] <[email protected]>
In the thread discussing ALL semantics and NULL [0], the question was
raised about adding a new regression test that checks selectivity
estimation. If the change gets committed, it would make sense to add
tests for this case as well.
Regarding the idea of optimizing the loop when all per-element
selectivities are the same: I ran some quick tests to see how much the
change in the v7-0002 patch affects planning time. Even without that
patch, iterating over an array with 50k elements takes about 30 ms.
```
CREATE TABLE t (val bytea PRIMARY KEY);
INSERT INTO t SELECT int4send(i) FROM generate_series(1,50000) AS i;
ANALYZE t;
SELECT n_distinct FROM pg_stats WHERE tablename = 't';
n_distinct
------------
-1
(1 row)
SELECT string_agg(format('int4send(%s)', i), ',') FROM
generate_series(1,50000) AS i \gset
EXPLAIN (SUMMARY) SELECT * FROM t WHERE val = ANY
(ARRAY[:string_agg]::bytea[]);
..........
Planning Time: 32.816 ms
(3 rows)
```
Given that, I don't see much benefit in adding additional logic here
just to avoid the loop. It would likely introduce extra code complexity
without a manful gain. If there is interest in optimization this case
further, I can revisit it and add the additional patch.
The patch v8 can still be reviewed as-is, and if the selectivity
regression test gets committed [0], I will add corresponding tests for
this change as well.
[0]:
https://www.postgresql.org/message-id/390a46f3-dbc4-4dc1-b49d-5cc61dd36026%40tantorlabs.com
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] v8-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch (18.9K, 2-v8-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch)
download | inline diff:
From 22be37fc625921dbb1722a18dde6b6e9da00890a Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Tue, 10 Mar 2026 17:11:16 +0300
Subject: [PATCH v8] Use hash-based MCV matching for ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using nested loops. For large IN-lists and/or large
MCV lists this leads to O(N*M) planning-time behavior.
This patch adds a hash-based matching strategy, similar to the one used
in join selectivity estimation. When MCV statistics are available and the
operator supports hashing, the smaller of the two inputs (MCV list or
IN-list constant elements) is chosen as the hash table build side, and
the other side is scanned once, reducing complexity to O(N+M).
The hash-based path is restricted to equality and inequality operators
that use eqsel()/neqsel(), and is applied only when suitable hash
functions and MCV statistics are available.
---
src/backend/utils/adt/selfuncs.c | 518 ++++++++++++++++++++++++++++++-
1 file changed, 513 insertions(+), 5 deletions(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index d4da0e8dea9..66c3e31eaa5 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,23 +146,27 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
#ifndef USE_ASSERT_CHECKING
-#define EQJOINSEL_MCV_HASH_THRESHOLD 200
+#define MCV_HASH_THRESHOLD 200
#else
-#define EQJOINSEL_MCV_HASH_THRESHOLD 20
+#define MCV_HASH_THRESHOLD 20
#endif
-/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
+/*
+ * Entries in the simplehash hash table used by
+ * eqjoinsel_find_matches and scalararray_mcv_hash_match
+ */
typedef struct MCVHashEntry
{
Datum value; /* the value represented by this entry */
int index; /* its index in the relevant AttStatsSlot */
uint32 hash; /* hash code for the Datum */
char status; /* status code used by simplehash.h */
+ int count; /* number of occurrences of current value in */
} MCVHashEntry;
/* private_data for the simplehash hash table */
@@ -184,6 +188,16 @@ get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator,
+ Oid collation, Selectivity s2,
+ Datum *elem_values, bool *elem_nulls,
+ int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr,
+ bool isEquality, bool isInequality);
+static void accum_scalararray_prob(double individual_s, int count,
+ bool useOr, bool isEquality,
+ bool isInequality, double nullfrac,
+ double *selec, double *s1disjoint);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -1893,6 +1907,36 @@ strip_array_coercion(Node *node)
return node;
}
+/*
+ * Accumulate the selectivity contribution of a single array element
+ * into the running ScalarArrayOpExpr selectivity estimate.
+ */
+static void
+accum_scalararray_prob(double individual_s, int count, bool useOr, bool isEquality,
+ bool isInequality, double nullfrac, double *selec, double *s1disjoint)
+{
+ if (count <= 0)
+ return;
+
+ if (isInequality)
+ individual_s = 1.0 - individual_s - nullfrac;
+
+ CLAMP_PROBABILITY(individual_s);
+
+ if (useOr)
+ {
+ *selec = 1.0 - (1.0 - *selec) * pow(1.0 - individual_s, count);
+ if (isEquality)
+ *s1disjoint += individual_s * count;
+ }
+ else
+ {
+ *selec = (*selec) * pow(individual_s, count);
+ if (isInequality)
+ *s1disjoint += count * (individual_s - 1.0);
+ }
+}
+
/*
* scalararraysel - Selectivity of ScalarArrayOpExpr Node.
*/
@@ -2025,6 +2069,36 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /* Try to avoid O(N^2) selectivity calculation for ScalarArrayOpExpr */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+ bool *elem_const = NULL;
+
+ /*
+ * If the clause is of the form "var OP something" or "something
+ * OP var", extract statistics for the variable. Otherwise, fall
+ * back to a default per-element estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ s1 = scalararray_mcv_hash_match(&vardata, operator,
+ clause->inputcollid, -1.0,
+ elem_values, elem_nulls,
+ num_elems, elem_const,
+ nominal_element_type, useOr,
+ isEquality, isInequality);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2100,6 +2174,100 @@ scalararraysel(PlannerInfo *root,
get_typlenbyval(arrayexpr->element_typeid,
&elmlen, &elmbyval);
+ /* Try to avoid O(N^2) selectivity calculation for ScalarArrayOpExpr */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+ int num_elems = list_length(arrayexpr->elements);
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then fall back to default code path to compute
+ * default selectivity.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ Selectivity nonconst_sel;
+ Datum *elem_values;
+ bool *elem_nulls;
+ bool *elem_const;
+ ListCell *lc;
+
+ /*
+ * Build arrays describing ARRAY[] elements: - elem_values:
+ * Datum value for Const elements - elem_nulls: whether
+ * element is NULL - elem_const: whether element is a Const
+ * node
+ */
+ elem_values = palloc_array(Datum, num_elems);
+ elem_nulls = palloc0_array(bool, num_elems);
+ elem_const = palloc0_array(bool, num_elems);
+
+ foreach(lc, arrayexpr->elements)
+ {
+ Node *elem_value = (Node *) lfirst(lc);
+ int i = foreach_current_index(lc);
+
+ if (IsA(elem_value, Const))
+ {
+ elem_values[i] = ((Const *) elem_value)->constvalue;
+ elem_nulls[i] = ((Const *) elem_value)->constisnull;
+ elem_const[i] = true;
+ }
+ else
+ {
+ elem_nulls[i] = false;
+ elem_const[i] = false;
+ }
+
+ /*
+ * For ALL semantics, if the array contains NULL, assume
+ * operator is strict. The ScalarArrayOpExpr cannot
+ * evaluate to TRUE, so return zero.
+ */
+ if (!useOr && elem_nulls[i])
+ {
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ return (Selectivity) 0.0;
+ }
+ }
+
+ /*
+ * Compute per-element selectivity via eqsel()/neqsel
+ * semantics.
+ */
+ nonconst_sel = var_eq_non_const(&vardata, operator,
+ clause->inputcollid,
+ other_op, var_on_left,
+ isInequality);
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator,
+ clause->inputcollid,
+ nonconst_sel, elem_values,
+ elem_nulls, num_elems,
+ elem_const,
+ nominal_element_type, useOr,
+ isEquality, isInequality);
+
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* We use the assumption of disjoint probabilities here too, although
* the odds of equal array elements are rather higher if the elements
@@ -2210,6 +2378,346 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr (ANY/ALL) using MCV statistics
+ * with hash-based matching.
+ *
+ * This function follows the same probability model as the generic
+ * ScalarArrayOpExpr selectivity code (independent or disjoint probabilities
+ * for OR/AND combinations), but attempts to speed up matching between
+ * IN-list elements and the column's most-common-values (MCV) statistics by
+ * using hashing instead of nested loops.
+ *
+ * MCV statistics are used only to obtain per-value selectivities for
+ * constants that match MCV entries. All probabilities are combined using
+ * the standard ANY/ALL formulas, exactly as in the generic estimator.
+ *
+ * The function may return -1.0 to indicate that hash-based MCV estimation
+ * is not applicable (for example, missing statistics, unsupported operator,
+ * or unavailable hash functions), in which case the caller should fall back
+ * to the generic ScalarArrayOpExpr selectivity estimation.
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * nonconst_sel: selectivity of non-const element
+ * elem_values: array of IN-list element values
+ * elem_nulls: array indicating which IN-list elements are NULL
+ * elem_const: array indicating which IN-list elements are Const nodes.
+ * array is NULL if all elemnets are const.
+ * num_elems: number of IN-list elements
+ * nominal_element_type: type of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator behaves like equality
+ * isInequality: true if the operator behaves like inequality
+ *
+ * Result:
+ * Selectivity estimate in the range [0.0, 1.0], or -1.0 if no estimate
+ * could be produced by this function.
+ *
+ * Note:
+ * This function assumes that the operator’s selectivity behavior matches
+ * eqsel()/neqsel semantics. It must not be used for operators with custom
+ * or non-standard selectivity behavior.
+ */
+static double
+scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator,
+ Oid collation, Selectivity nonconst_sel,
+ Datum *elem_values, bool *elem_nulls, int num_elems,
+ bool *elem_const, Oid nominal_element_type,
+ bool useOr, bool isEquality, bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ /*
+ * If the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= MCV_HASH_THRESHOLD)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVHashContext hashContext;
+ double sumallcommon = 0.0,
+ nonmcv_selec = 0.0;
+ bool isdefault;
+ bool hash_mcv;
+ double otherdistinct;
+ Datum *arrayHash;
+ Datum *arrayProbe;
+ int nvaluesHash;
+ int nvaluesProbe;
+ int nonmcv_cnt = num_elems;
+ int nonconst_cnt = 0;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ sumallcommon += sslot.numbers[i];
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This is
+ * the part of the column distribution not covered by MCVs.
+ */
+ nonmcv_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(nonmcv_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ nonmcv_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 &&
+ nonmcv_selec > sslot.numbers[sslot.nnumbers - 1])
+ {
+ nonmcv_selec = sslot.numbers[sslot.nnumbers - 1];
+ }
+
+ /* Make sure we build the hash table on the smaller array. */
+ if (sslot.nvalues <= num_elems)
+ {
+ hash_mcv = true;
+ nvaluesHash = sslot.nvalues;
+ nvaluesProbe = num_elems;
+ arrayHash = sslot.values;
+ arrayProbe = elem_values;
+ }
+ else
+ {
+ hash_mcv = false;
+ nvaluesHash = num_elems;
+ nvaluesProbe = sslot.nvalues;
+ arrayHash = elem_values;
+ arrayProbe = sslot.values;
+ }
+
+ fmgr_info(hash_mcv ? hashLeft : hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.op_is_reversed = hash_mcv;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(hash_mcv ? sslot.valuetype : nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVHashTable_create(CurrentMemoryContext,
+ nvaluesHash,
+ &hashContext);
+
+ /* Build a hash table over the smaller input side. */
+ for (int i = 0; i < nvaluesHash; i++)
+ {
+ bool found = false;
+ MCVHashEntry *entry;
+
+ /*
+ * When hashing IN-list values (hash_mcv == false), we only insert
+ * constant, non-NULL elements. NULL and non-Const elements are
+ * counted separately, because they cannot participate in MCV
+ * matching and must be handled later using generic selectivity
+ * estimation.
+ */
+ if (!hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (elem_const != NULL && !elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_insert(hashTable, arrayHash[i], &found);
+
+ /*
+ * entry->count tracks how many times the same value appears, so
+ * that duplicate IN-list elements can be folded into the
+ * probability calculation.
+ */
+ if (likely(!found))
+ {
+ entry->index = i;
+ entry->count = 1;
+ }
+ else
+ entry->count++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hash_mcv ? hashRight : hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ for (int i = 0; i < nvaluesProbe; i++)
+ {
+ MCVHashEntry *entry;
+ Selectivity s1;
+ int nvaluesmcv;
+
+ /*
+ * When probing with IN-list elements, ignore NULLs and non-Const
+ * expressions: they cannot be matched against MCVs and will be
+ * accounted for later by generic estimation.
+ */
+ if (hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (elem_const != NULL && !elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_lookup(hashTable, arrayProbe[i]);
+
+ /*
+ * If found, obtain its MCV frequency and remember how many values
+ * on the hashed side map to this entry.
+ */
+ if (entry != NULL)
+ {
+ s1 = hash_mcv ? sslot.numbers[entry->index]
+ : sslot.numbers[i];
+
+ nvaluesmcv = entry->count;
+
+ accum_scalararray_prob(s1, nvaluesmcv, useOr, isEquality,
+ isInequality, nullfrac, &selec,
+ &s1disjoint);
+
+ /* Matched values are no longer considered non-MCV */
+ nonmcv_cnt -= nvaluesmcv;
+ }
+ }
+
+ /*
+ * Account for constant IN-list values that did not match any MCV.
+ *
+ * Each such value is assumed to have probability = nonmcv_selec,
+ * derived from the remaining (non-MCV) probability mass.
+ */
+ accum_scalararray_prob(nonmcv_selec, nonmcv_cnt, useOr, isEquality,
+ isInequality, nullfrac, &selec, &s1disjoint);
+
+ /*
+ * Account for non-Const IN-list elements.
+ *
+ * These values cannot be matched against MCVs, so we rely on the
+ * operator's generic selectivity estimator for each of them.
+ */
+ accum_scalararray_prob(nonconst_sel, nonconst_cnt, useOr, isEquality,
+ isInequality, nullfrac, &selec, &s1disjoint);
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed distinct,
+ * the events are disjoint and the total probability is the sum of
+ * individual probabilities. Use that estimate if it lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+
+ CLAMP_PROBABILITY(selec);
+
+ MCVHashTable_destroy(hashTable);
+ }
+
+ if (have_mcvs)
+ free_attstatsslot(&sslot);
+
+ return selec;
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
@@ -2446,7 +2954,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
* If the MCV lists are long enough to justify hashing, try to look up
* hash functions for the join operator.
*/
- if ((sslot1.nvalues + sslot2.nvalues) >= EQJOINSEL_MCV_HASH_THRESHOLD)
+ if ((sslot1.nvalues + sslot2.nvalues) >= MCV_HASH_THRESHOLD)
(void) get_op_hash_functions(operator, &hashLeft, &hashRight);
}
else
--
2.34.1
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-03-11 08:01 Zsolt Parragi <[email protected]>
parent: Ilia Evdokimov <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Zsolt Parragi @ 2026-03-11 08:01 UTC (permalink / raw)
To: Ilia Evdokimov <[email protected]>; +Cc: David Geier <[email protected]>; Chengpeng Yan <[email protected]>; Tatsuya Kawata <[email protected]>; [email protected] <[email protected]>
+ if (elem_nulls[i])
+ {
+ nonmcv_cnt--;
+ continue;
+ }
> The patch v8 can still be reviewed as-is, and if the selectivity
> regression test gets committed [0], I will add corresponding tests for
> this change as well.
Without [0], the const path will return incorrect results for <> ALL
and NULLs. Compared to that, the other path still has special handling
in it:
+ /*
+ * For ALL semantics, if the array contains NULL, assume
+ * operator is strict. The ScalarArrayOpExpr cannot
+ * evaluate to TRUE, so return zero.
+ */
+ nonconst_sel = var_eq_non_const(&vardata, operator,
+ clause->inputcollid,
+ other_op, var_on_left,
+ isInequality);
+ if (isInequality)
+ individual_s = 1.0 - individual_s - nullfrac;
Isn't this the double negation issue again, which was once
mentioned/fixed earlier?
+ int count; /* number of occurrences of current value in */
That's a truncated comment
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-03-20 15:58 Ilia Evdokimov <[email protected]>
parent: Zsolt Parragi <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Ilia Evdokimov @ 2026-03-20 15:58 UTC (permalink / raw)
To: Zsolt Parragi <[email protected]>; +Cc: David Geier <[email protected]>; Chengpeng Yan <[email protected]>; Tatsuya Kawata <[email protected]>; [email protected] <[email protected]>
On 3/11/26 11:01, Zsolt Parragi wrote:
> + /*
> + * For ALL semantics, if the array contains NULL, assume
> + * operator is strict. The ScalarArrayOpExpr cannot
> + * evaluate to TRUE, so return zero.
> + */
>
>
>
> + nonconst_sel = var_eq_non_const(&vardata, operator,
> + clause->inputcollid,
> + other_op, var_on_left,
> + isInequality);
>
> + if (isInequality)
> + individual_s = 1.0 - individual_s - nullfrac;
>
> Isn't this the double negation issue again, which was once
> mentioned/fixed earlier?
Right. I fixed it by using 'invert' for non-constant case. If there is a
more elegant way to structure this, suggestions are very welcome.
> + int count; /* number of occurrences of current value in */
>
> That's a truncated comment
Fixed.
After the commit c95cd29 I have rebased this patch. During the rebase, I
also add the NUL-handling path. In particular, I added an Assert(useOr)
in the relevant branch to document and enforce the expected execution flow.
Additionally after the 374a639 I prepared a set of regression-style
tests to verify that the selectivity estimates remain unchanged before
and after applying the patch. However, these tests rely on stable row
estimates from EXPLAIN, which are not guaranteed to be consistent across
platforms. For that reason, they are not suitable for inclusion in the
upstream test suite. I will keep these tests locally to validate
correctness before and after the patch.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] v9-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch (18.9K, 2-v9-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch)
download | inline diff:
From 5d93f945a022d38b3dd39c940ba620a3640bb236 Mon Sep 17 00:00:00 2001
From: Evdokimov Ilia <[email protected]>
Date: Fri, 20 Mar 2026 18:18:24 +0300
Subject: [PATCH v9] Use hash-based MCV matching for ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using nested loops. For large IN-lists and/or large
MCV lists this leads to O(N*M) planning-time behavior.
This patch adds a hash-based matching strategy, similar to the one used
in join selectivity estimation. When MCV statistics are available and the
operator supports hashing, the smaller of the two inputs (MCV list or
IN-list constant elements) is chosen as the hash table build side, and
the other side is scanned once, reducing complexity to O(N+M).
The hash-based path is restricted to equality and inequality operators
that use eqsel()/neqsel(), and is applied only when suitable hash
functions and MCV statistics are available.
---
src/backend/utils/adt/selfuncs.c | 520 ++++++++++++++++++++++++++++++-
1 file changed, 515 insertions(+), 5 deletions(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 86b55c9bb8b..1d812162980 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,23 +146,27 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
#ifndef USE_ASSERT_CHECKING
-#define EQJOINSEL_MCV_HASH_THRESHOLD 200
+#define MCV_HASH_THRESHOLD 200
#else
-#define EQJOINSEL_MCV_HASH_THRESHOLD 20
+#define MCV_HASH_THRESHOLD 20
#endif
-/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
+/*
+ * Entries in the simplehash hash table used by
+ * eqjoinsel_find_matches and scalararray_mcv_hash_match
+ */
typedef struct MCVHashEntry
{
Datum value; /* the value represented by this entry */
int index; /* its index in the relevant AttStatsSlot */
uint32 hash; /* hash code for the Datum */
char status; /* status code used by simplehash.h */
+ int count; /* number of occurrences of current value */
} MCVHashEntry;
/* private_data for the simplehash hash table */
@@ -184,6 +188,16 @@ get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator,
+ Oid collation, Selectivity nonconst_sel,
+ Datum *elem_values, bool *elem_nulls,
+ int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr,
+ bool isEquality, bool isInequality);
+static void accum_scalararray_prob(double s1, int count, bool useOr,
+ bool isEquality, bool isInequality,
+ double nullfrac, bool invert,
+ double *selec, double *s1disjoint);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -1893,6 +1907,37 @@ strip_array_coercion(Node *node)
return node;
}
+/*
+ * Accumulate the selectivity contribution of a single array element
+ * into the running ScalarArrayOpExpr selectivity estimate.
+ */
+static void
+accum_scalararray_prob(double s1, int count, bool useOr, bool isEquality,
+ bool isInequality, double nullfrac, bool invert,
+ double *selec, double *s1disjoint)
+{
+ if (count <= 0)
+ return;
+
+ if (invert && isInequality)
+ s1 = 1.0 - s1 - nullfrac;
+
+ CLAMP_PROBABILITY(s1);
+
+ if (useOr)
+ {
+ *selec = 1.0 - (1.0 - *selec) * pow(1.0 - s1, count);
+ if (isEquality)
+ *s1disjoint += s1 * count;
+ }
+ else
+ {
+ *selec = (*selec) * pow(s1, count);
+ if (isInequality)
+ *s1disjoint += count * (s1 - 1.0);
+ }
+}
+
/*
* scalararraysel - Selectivity of ScalarArrayOpExpr Node.
*/
@@ -2034,6 +2079,36 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /* Try to avoid O(N^2) selectivity calculation for ScalarArrayOpExpr */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+ bool *elem_const = NULL;
+
+ /*
+ * If the clause is of the form "var OP something" or "something
+ * OP var", extract statistics for the variable. Otherwise, fall
+ * back to a default per-element estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ s1 = scalararray_mcv_hash_match(&vardata, operator,
+ clause->inputcollid, -1.0,
+ elem_values, elem_nulls,
+ num_elems, elem_const,
+ nominal_element_type, useOr,
+ isEquality, isInequality);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2109,6 +2184,100 @@ scalararraysel(PlannerInfo *root,
get_typlenbyval(arrayexpr->element_typeid,
&elmlen, &elmbyval);
+ /* Try to avoid O(N^2) selectivity calculation for ScalarArrayOpExpr */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+ int num_elems = list_length(arrayexpr->elements);
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then fall back to default code path to compute
+ * default selectivity.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ Selectivity nonconst_sel;
+ Datum *elem_values;
+ bool *elem_nulls;
+ bool *elem_const;
+ ListCell *lc;
+
+ /*
+ * Build arrays describing ARRAY[] elements: - elem_values:
+ * Datum value for Const elements - elem_nulls: whether
+ * element is NULL - elem_const: whether element is a Const
+ * node
+ */
+ elem_values = palloc_array(Datum, num_elems);
+ elem_nulls = palloc0_array(bool, num_elems);
+ elem_const = palloc0_array(bool, num_elems);
+
+ foreach(lc, arrayexpr->elements)
+ {
+ Node *elem_value = (Node *) lfirst(lc);
+ int i = foreach_current_index(lc);
+
+ if (IsA(elem_value, Const))
+ {
+ elem_values[i] = ((Const *) elem_value)->constvalue;
+ elem_nulls[i] = ((Const *) elem_value)->constisnull;
+ elem_const[i] = true;
+ }
+ else
+ {
+ elem_nulls[i] = false;
+ elem_const[i] = false;
+ }
+
+ /*
+ * When the array contains a NULL constant, same as var_eq_const,
+ * we assume the operator is strict and nothing will match, thus
+ * return 0.0.
+ */
+ if (!useOr && elem_nulls[i])
+ {
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ return (Selectivity) 0.0;
+ }
+ }
+
+ /*
+ * Compute per-element selectivity via eqsel()/neqsel
+ * semantics.
+ */
+ nonconst_sel = var_eq_non_const(&vardata, operator,
+ clause->inputcollid,
+ other_op, var_on_left,
+ isInequality);
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator,
+ clause->inputcollid,
+ nonconst_sel, elem_values,
+ elem_nulls, num_elems,
+ elem_const,
+ nominal_element_type, useOr,
+ isEquality, isInequality);
+
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* We use the assumption of disjoint probabilities here too, although
* the odds of equal array elements are rather higher if the elements
@@ -2227,6 +2396,347 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr (ANY/ALL) using MCV statistics
+ * with hash-based matching.
+ *
+ * This function follows the same probability model as the generic
+ * ScalarArrayOpExpr selectivity code (independent or disjoint probabilities
+ * for OR/AND combinations), but attempts to speed up matching between
+ * IN-list elements and the column's most-common-values (MCV) statistics by
+ * using hashing instead of nested loops.
+ *
+ * MCV statistics are used only to obtain per-value selectivities for
+ * constants that match MCV entries. All probabilities are combined using
+ * the standard ANY/ALL formulas, exactly as in the generic estimator.
+ *
+ * The function may return -1.0 to indicate that hash-based MCV estimation
+ * is not applicable (for example, missing statistics, unsupported operator,
+ * or unavailable hash functions), in which case the caller should fall back
+ * to the generic ScalarArrayOpExpr selectivity estimation.
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * nonconst_sel: selectivity of non-const element
+ * elem_values: array of IN-list element values
+ * elem_nulls: array indicating which IN-list elements are NULL
+ * elem_const: array indicating which IN-list elements are Const nodes.
+ * array is NULL if all elemnets are const.
+ * num_elems: number of IN-list elements
+ * nominal_element_type: type of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator behaves like equality
+ * isInequality: true if the operator behaves like inequality
+ *
+ * Result:
+ * Selectivity estimate in the range [0.0, 1.0], or -1.0 if no estimate
+ * could be produced by this function.
+ *
+ * Note:
+ * This function assumes that the operator’s selectivity behavior matches
+ * eqsel()/neqsel semantics. It must not be used for operators with custom
+ * or non-standard selectivity behavior.
+ */
+static double
+scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator,
+ Oid collation, Selectivity nonconst_sel,
+ Datum *elem_values, bool *elem_nulls, int num_elems,
+ bool *elem_const, Oid nominal_element_type,
+ bool useOr, bool isEquality, bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ /*
+ * If the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= MCV_HASH_THRESHOLD)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVHashContext hashContext;
+ double sumallcommon = 0.0,
+ nonmcv_selec = 0.0;
+ bool isdefault;
+ bool hash_mcv;
+ double otherdistinct;
+ Datum *arrayHash;
+ Datum *arrayProbe;
+ int nvaluesHash;
+ int nvaluesProbe;
+ int nonmcv_cnt = num_elems;
+ int nonconst_cnt = 0;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ sumallcommon += sslot.numbers[i];
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This is
+ * the part of the column distribution not covered by MCVs.
+ */
+ nonmcv_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(nonmcv_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ nonmcv_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 && nonmcv_selec > sslot.numbers[sslot.nnumbers - 1])
+ nonmcv_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Make sure we build the hash table on the smaller array. */
+ if (sslot.nvalues <= num_elems)
+ {
+ hash_mcv = true;
+ nvaluesHash = sslot.nvalues;
+ nvaluesProbe = num_elems;
+ arrayHash = sslot.values;
+ arrayProbe = elem_values;
+ }
+ else
+ {
+ hash_mcv = false;
+ nvaluesHash = num_elems;
+ nvaluesProbe = sslot.nvalues;
+ arrayHash = elem_values;
+ arrayProbe = sslot.values;
+ }
+
+ fmgr_info(hash_mcv ? hashLeft : hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.op_is_reversed = hash_mcv;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(hash_mcv ? sslot.valuetype : nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVHashTable_create(CurrentMemoryContext,
+ nvaluesHash,
+ &hashContext);
+
+ /* Build a hash table over the smaller input side. */
+ for (int i = 0; i < nvaluesHash; i++)
+ {
+ bool found = false;
+ MCVHashEntry *entry;
+
+ /*
+ * When hashing IN-list values (hash_mcv == false), we only insert
+ * constant, non-NULL elements. NULL and non-Const elements are
+ * counted separately, because they cannot participate in MCV
+ * matching and must be handled later using generic selectivity
+ * estimation.
+ */
+ if (!hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ Assert(useOr);
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (elem_const != NULL && !elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_insert(hashTable, arrayHash[i], &found);
+
+ /*
+ * entry->count tracks how many times the same value appears, so
+ * that duplicate IN-list elements can be folded into the
+ * probability calculation.
+ */
+ if (likely(!found))
+ {
+ entry->index = i;
+ entry->count = 1;
+ }
+ else
+ entry->count++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hash_mcv ? hashRight : hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ for (int i = 0; i < nvaluesProbe; i++)
+ {
+ MCVHashEntry *entry;
+ Selectivity s1;
+ int nvaluesmcv;
+
+ /*
+ * When probing with IN-list elements, ignore NULLs and non-Const
+ * expressions: they cannot be matched against MCVs and will be
+ * accounted for later by generic estimation.
+ */
+ if (hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ Assert(useOr);
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (elem_const != NULL && !elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_lookup(hashTable, arrayProbe[i]);
+
+ /*
+ * If found, obtain its MCV frequency and remember how many values
+ * on the hashed side map to this entry.
+ */
+ if (entry != NULL)
+ {
+ s1 = hash_mcv ? sslot.numbers[entry->index]
+ : sslot.numbers[i];
+
+ nvaluesmcv = entry->count;
+
+ accum_scalararray_prob(s1, nvaluesmcv, useOr, isEquality,
+ isInequality, nullfrac, true, &selec,
+ &s1disjoint);
+
+ /* Matched values are no longer considered non-MCV */
+ nonmcv_cnt -= nvaluesmcv;
+ }
+ }
+
+ /*
+ * Account for constant IN-list values that did not match any MCV.
+ *
+ * Each such value is assumed to have probability = nonmcv_selec,
+ * derived from the remaining (non-MCV) probability mass.
+ */
+ accum_scalararray_prob(nonmcv_selec, nonmcv_cnt, useOr, isEquality,
+ isInequality, nullfrac, true,
+ &selec, &s1disjoint);
+
+ /*
+ * Account for non-Const IN-list elements.
+ *
+ * These values cannot be matched against MCVs, so we rely on the
+ * operator's generic selectivity estimator for each of them.
+ */
+ accum_scalararray_prob(nonconst_sel, nonconst_cnt, useOr, isEquality,
+ isInequality, nullfrac, false,
+ &selec, &s1disjoint);
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed distinct,
+ * the events are disjoint and the total probability is the sum of
+ * individual probabilities. Use that estimate if it lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+
+ CLAMP_PROBABILITY(selec);
+
+ MCVHashTable_destroy(hashTable);
+ }
+
+ if (have_mcvs)
+ free_attstatsslot(&sslot);
+
+ return selec;
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
@@ -2463,7 +2973,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
* If the MCV lists are long enough to justify hashing, try to look up
* hash functions for the join operator.
*/
- if ((sslot1.nvalues + sslot2.nvalues) >= EQJOINSEL_MCV_HASH_THRESHOLD)
+ if ((sslot1.nvalues + sslot2.nvalues) >= MCV_HASH_THRESHOLD)
(void) get_op_hash_functions(operator, &hashLeft, &hashRight);
}
else
--
2.34.1
[text/x-patch] hash_based_any_tests.patch (14.1K, 3-hash_based_any_tests.patch)
download | inline diff:
diff --git a/src/test/regress/expected/planner_est.out b/src/test/regress/expected/planner_est.out
index b62a47552fa..3ef720908f5 100644
--- a/src/test/regress/expected/planner_est.out
+++ b/src/test/regress/expected/planner_est.out
@@ -210,4 +210,213 @@ false, true, false, true);
-> Result (cost=N..N rows=1 width=N)
(4 rows)
+-- Ensure stable and rich MCV statistics
+SET default_statistics_target = 1000;
+CREATE TABLE t_mcv (a int);
+-- Build ~100 MCV values with uniform distribution
+INSERT INTO t_mcv
+SELECT (i % 100)
+FROM generate_series(1, 20000) s(i);
+ANALYZE t_mcv;
+-- =========================================================
+-- CASE 1: Large ANY list (MCV < ANY) → hash on MCV
+-- =========================================================
+-- 1. Single element
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT 1));$$,
+false, true, false, true);
+ explain_mask_costs
+--------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1912 width=N)
+ Filter: (a = ANY ((InitPlan array_1).col1))
+ InitPlan array_1
+ -> Result (cost=N..N rows=1 width=N)
+(4 rows)
+
+-- 2. Multiple elements (all in MCV)
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT i FROM generate_series(1,3) s(i)));$$,
+false, true, false, true);
+ explain_mask_costs
+------------------------------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1912 width=N)
+ Filter: (a = ANY ((InitPlan array_1).col1))
+ InitPlan array_1
+ -> Function Scan on generate_series s (cost=N..N rows=3 width=N)
+(4 rows)
+
+-- 3. Includes non-MCV values
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[1, 2, 1000]);$$,
+false, true, false, true);
+ explain_mask_costs
+-------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=400 width=N)
+ Filter: (a = ANY ('{1,2,1000}'::integer[]))
+(2 rows)
+
+-- 4. Duplicates
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT (i % 3) + 1 FROM generate_series(1,30) s(i)));$$,
+false, true, false, true);
+ explain_mask_costs
+-------------------------------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1912 width=N)
+ Filter: (a = ANY ((InitPlan array_1).col1))
+ InitPlan array_1
+ -> Function Scan on generate_series s (cost=N..N rows=30 width=N)
+(4 rows)
+
+-- =========================================================
+-- CASE 2: Small ANY list (ANY < MCV) → hash on ANY
+-- =========================================================
+-- 1. Single element
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10]);$$,
+false, true, false, true);
+ explain_mask_costs
+-------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=200 width=N)
+ Filter: (a = ANY ('{10}'::integer[]))
+(2 rows)
+
+-- 2. Multiple elements (all in MCV)
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10,20,30]);$$,
+false, true, false, true);
+ explain_mask_costs
+-------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=600 width=N)
+ Filter: (a = ANY ('{10,20,30}'::integer[]))
+(2 rows)
+
+-- 3. Includes non-MCV values
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10,20,10000]);$$,
+false, true, false, true);
+ explain_mask_costs
+--------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=400 width=N)
+ Filter: (a = ANY ('{10,20,10000}'::integer[]))
+(2 rows)
+
+-- 4. Duplicates
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10,10,10,20,20]);$$,
+false, true, false, true);
+ explain_mask_costs
+-----------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1000 width=N)
+ Filter: (a = ANY ('{10,10,10,20,20}'::integer[]))
+(2 rows)
+
+-- =========================================================
+-- CASE 3: Guaranteed large case → stress hash path
+-- =========================================================
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT i % 100 FROM generate_series(1,500) s(i)));$$,
+false, true, false, true);
+ explain_mask_costs
+--------------------------------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1912 width=N)
+ Filter: (a = ANY ((InitPlan array_1).col1))
+ InitPlan array_1
+ -> Function Scan on generate_series s (cost=N..N rows=500 width=N)
+(4 rows)
+
+-- =========================================================
+-- CASE 4: inequality (<> ALL)
+-- =========================================================
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a <> ALL (ARRAY[1,2,3]);$$,
+false, true, false, true);
+ explain_mask_costs
+---------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=19400 width=N)
+ Filter: (a <> ALL ('{1,2,3}'::integer[]))
+(2 rows)
+
+-- =========================================================
+-- CASE 5: mix const + non-const
+-- =========================================================
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[1,2] || ARRAY(SELECT i FROM generate_series(3,5) s(i)));$$,
+false, true, false, true);
+ explain_mask_costs
+------------------------------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1912 width=N)
+ Filter: (a = ANY (('{1,2}'::integer[] || (InitPlan array_1).col1)))
+ InitPlan array_1
+ -> Function Scan on generate_series s (cost=N..N rows=3 width=N)
+(4 rows)
+
+-- =========================================================
+-- CASE 6: NULL handling
+-- =========================================================
+-- ANY with NULL
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[NULL,1]);$$,
+false, true, false, true);
+ explain_mask_costs
+-------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=200 width=N)
+ Filter: (a = ANY ('{NULL,1}'::integer[]))
+(2 rows)
+
+-- ALL with NULL (should be 0 selectivity)
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ALL (ARRAY[1,NULL]);$$,
+false, true, false, true);
+ explain_mask_costs
+-----------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1 width=N)
+ Filter: (a = ALL ('{1,NULL}'::integer[]))
+(2 rows)
+
+-- =========================================================
+-- CASE 7: Combined all of them
+-- =========================================================
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[1, 2, 3, 1000, 2000, NULL, 1, 1, 2, (SELECT 4), (SELECT 5), (SELECT 10000), (SELECT 4), (SELECT 4)] || ARRAY( SELECT i % 120 FROM generate_series(1, 500) s(i)));$$,
+false, true, false, true);
+ explain_mask_costs
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=1912 width=N)
+ Filter: (a = ANY ((ARRAY[1, 2, 3, 1000, 2000, NULL::integer, 1, 1, 2, (InitPlan expr_1).col1, (InitPlan expr_2).col1, (InitPlan expr_3).col1, (InitPlan expr_4).col1, (InitPlan expr_5).col1] || (InitPlan array_1).col1)))
+ InitPlan expr_1
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_2
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_3
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_4
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_5
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan array_1
+ -> Function Scan on generate_series s (cost=N..N rows=500 width=N)
+(14 rows)
+
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a <> ALL (ARRAY[1, 2, 3, 1000, 2000, NULL, 1, 1, 2, (SELECT 4), (SELECT 5), (SELECT 10000), (SELECT 4), (SELECT 4)] || ARRAY( SELECT i % 120 FROM generate_series(1, 500) s(i)));$$,
+false, true, false, true);
+ explain_mask_costs
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Seq Scan on t_mcv (cost=N..N rows=18088 width=N)
+ Filter: (a <> ALL ((ARRAY[1, 2, 3, 1000, 2000, NULL::integer, 1, 1, 2, (InitPlan expr_1).col1, (InitPlan expr_2).col1, (InitPlan expr_3).col1, (InitPlan expr_4).col1, (InitPlan expr_5).col1] || (InitPlan array_1).col1)))
+ InitPlan expr_1
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_2
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_3
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_4
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan expr_5
+ -> Result (cost=N..N rows=1 width=N)
+ InitPlan array_1
+ -> Function Scan on generate_series s (cost=N..N rows=500 width=N)
+(14 rows)
+
+DROP TABLE t_mcv;
DROP FUNCTION explain_mask_costs(text, bool, bool, bool, bool);
diff --git a/src/test/regress/sql/planner_est.sql b/src/test/regress/sql/planner_est.sql
index 53210d5baad..ba8f8bd8fb6 100644
--- a/src/test/regress/sql/planner_est.sql
+++ b/src/test/regress/sql/planner_est.sql
@@ -147,4 +147,116 @@ SELECT explain_mask_costs($$
SELECT * FROM tenk1 WHERE unique1 <> ALL (ARRAY[1, 2, 98, (SELECT 99), NULL]);$$,
false, true, false, true);
+-- Ensure stable and rich MCV statistics
+SET default_statistics_target = 1000;
+
+CREATE TABLE t_mcv (a int);
+
+-- Build ~100 MCV values with uniform distribution
+INSERT INTO t_mcv
+SELECT (i % 100)
+FROM generate_series(1, 20000) s(i);
+
+ANALYZE t_mcv;
+
+-- =========================================================
+-- CASE 1: Large ANY list (MCV < ANY) → hash on MCV
+-- =========================================================
+
+-- 1. Single element
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT 1));$$,
+false, true, false, true);
+
+-- 2. Multiple elements (all in MCV)
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT i FROM generate_series(1,3) s(i)));$$,
+false, true, false, true);
+
+-- 3. Includes non-MCV values
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[1, 2, 1000]);$$,
+false, true, false, true);
+
+-- 4. Duplicates
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT (i % 3) + 1 FROM generate_series(1,30) s(i)));$$,
+false, true, false, true);
+
+-- =========================================================
+-- CASE 2: Small ANY list (ANY < MCV) → hash on ANY
+-- =========================================================
+
+-- 1. Single element
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10]);$$,
+false, true, false, true);
+
+-- 2. Multiple elements (all in MCV)
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10,20,30]);$$,
+false, true, false, true);
+
+-- 3. Includes non-MCV values
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10,20,10000]);$$,
+false, true, false, true);
+
+-- 4. Duplicates
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[10,10,10,20,20]);$$,
+false, true, false, true);
+
+-- =========================================================
+-- CASE 3: Guaranteed large case → stress hash path
+-- =========================================================
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY(SELECT i % 100 FROM generate_series(1,500) s(i)));$$,
+false, true, false, true);
+
+-- =========================================================
+-- CASE 4: inequality (<> ALL)
+-- =========================================================
+
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a <> ALL (ARRAY[1,2,3]);$$,
+false, true, false, true);
+
+-- =========================================================
+-- CASE 5: mix const + non-const
+-- =========================================================
+
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[1,2] || ARRAY(SELECT i FROM generate_series(3,5) s(i)));$$,
+false, true, false, true);
+
+-- =========================================================
+-- CASE 6: NULL handling
+-- =========================================================
+
+-- ANY with NULL
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[NULL,1]);$$,
+false, true, false, true);
+
+-- ALL with NULL (should be 0 selectivity)
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ALL (ARRAY[1,NULL]);$$,
+false, true, false, true);
+
+-- =========================================================
+-- CASE 7: Combined all of them
+-- =========================================================
+
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a = ANY (ARRAY[1, 2, 3, 1000, 2000, NULL, 1, 1, 2, (SELECT 4), (SELECT 5), (SELECT 10000), (SELECT 4), (SELECT 4)] || ARRAY( SELECT i % 120 FROM generate_series(1, 500) s(i)));$$,
+false, true, false, true);
+
+SELECT explain_mask_costs($$
+SELECT * FROM t_mcv WHERE a <> ALL (ARRAY[1, 2, 3, 1000, 2000, NULL, 1, 1, 2, (SELECT 4), (SELECT 5), (SELECT 10000), (SELECT 4), (SELECT 4)] || ARRAY( SELECT i % 120 FROM generate_series(1, 500) s(i)));$$,
+false, true, false, true);
+
+DROP TABLE t_mcv;
+
+
DROP FUNCTION explain_mask_costs(text, bool, bool, bool, bool);
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-04-08 16:48 Ilia Evdokimov <[email protected]>
parent: Ilia Evdokimov <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Ilia Evdokimov @ 2026-04-08 16:48 UTC (permalink / raw)
To: Zsolt Parragi <[email protected]>; David Geier <[email protected]>; Chengpeng Yan <[email protected]>; Tatsuya Kawata <[email protected]>; +Cc: [email protected] <[email protected]>
I rebased the previous patch after it was marked as "Need rebase"
I also initialized the 'elem_cost' array to all 'true' values to
simplify the code and avoid confusion, and rewrote
accum_scalararray_prob() to improve readability.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] v10-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExp.patch (19.4K, 2-v10-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExp.patch)
download | inline diff:
From 9b931efdcbe405eab1d58f2a215cb296ab145a59 Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Wed, 8 Apr 2026 19:45:00 +0300
Subject: [PATCH v10] Use hash-based MCV matching for ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using nested loops. For large IN-lists and/or large
MCV lists this leads to O(N*M) planning-time behavior.
This patch adds a hash-based matching strategy, similar to the one used
in join selectivity estimation. When MCV statistics are available and the
operator supports hashing, the smaller of the two inputs (MCV list or
IN-list constant elements) is chosen as the hash table build side, and
the other side is scanned once, reducing complexity to O(N+M).
The hash-based path is restricted to equality and inequality operators
that use eqsel()/neqsel(), and is applied only when suitable hash
functions and MCV statistics are available.
---
src/backend/utils/adt/selfuncs.c | 547 ++++++++++++++++++++++++++++++-
1 file changed, 542 insertions(+), 5 deletions(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 4160d2d6e24..af12071ed0e 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,23 +146,27 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
#ifndef USE_ASSERT_CHECKING
-#define EQJOINSEL_MCV_HASH_THRESHOLD 200
+#define MCV_HASH_THRESHOLD 200
#else
-#define EQJOINSEL_MCV_HASH_THRESHOLD 20
+#define MCV_HASH_THRESHOLD 20
#endif
-/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
+/*
+ * Entries in the simplehash hash table used by
+ * eqjoinsel_find_matches and scalararray_mcv_hash_match
+ */
typedef struct MCVHashEntry
{
Datum value; /* the value represented by this entry */
int index; /* its index in the relevant AttStatsSlot */
uint32 hash; /* hash code for the Datum */
char status; /* status code used by simplehash.h */
+ int count; /* number of occurrences of current value */
} MCVHashEntry;
/* private_data for the simplehash hash table */
@@ -184,6 +188,16 @@ get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator,
+ Oid collation, Selectivity nonconst_sel,
+ Datum *elem_values, bool *elem_nulls,
+ int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr,
+ bool isEquality, bool isInequality);
+static void accum_scalararray_prob(Selectivity elem_sel, int count, bool useOr,
+ bool isEquality, bool isInequality,
+ double nullfrac, bool invert,
+ Selectivity *p_selec, Selectivity *p_selec_disjoint);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -1893,6 +1907,67 @@ strip_array_coercion(Node *node)
return node;
}
+/*
+ * accum_scalararray_prob - combine selectivity for repeated elements
+ *
+ * Update the running selectivity for a ScalarArrayOpExpr by adding the
+ * contribution of 'count' identical elements with per-element selectivity.
+ *
+ * This is equivalent to applying the per-element update 'count' times:
+ *
+ * OR (ANY): P = P + s - P*s
+ * AND (ALL): P = P * s
+ *
+ * but uses closed-form formulas:
+ *
+ * OR: P = 1 - (1 - P) * (1 - s)^count
+ * AND: P = P * s^count
+ *
+ * The selec_disjoint accumulator tracks the alternative "disjoint events"
+ * estimate used for "= ANY" / "<> ALL".
+ */
+static void
+accum_scalararray_prob(Selectivity elem_sel, int count, bool useOr,
+ bool isEquality, bool isInequality,
+ double nullfrac, bool invert,
+ Selectivity *p_selec, Selectivity *p_selec_disjoint)
+{
+ Selectivity selec;
+ Selectivity disjoint;
+
+ if (count <= 0)
+ return;
+
+ /* Convert to inequality probability if needed */
+ if (invert && isInequality)
+ elem_sel = 1.0 - elem_sel - nullfrac;
+
+ CLAMP_PROBABILITY(elem_sel);
+
+ selec = *p_selec;
+ disjoint = *p_selec_disjoint;
+
+ if (useOr)
+ {
+ /* ANY semantics: probability that at least one element matches */
+ selec = 1.0 - (1.0 - selec) * pow(1.0 - elem_sel, count);
+
+ if (isEquality)
+ disjoint += elem_sel * count;
+ }
+ else
+ {
+ /* ALL semantics: probability that all elements match */
+ selec *= pow(elem_sel, count);
+
+ if (isInequality)
+ disjoint += count * (elem_sel - 1.0);
+ }
+
+ *p_selec = selec;
+ *p_selec_disjoint = disjoint;
+}
+
/*
* scalararraysel - Selectivity of ScalarArrayOpExpr Node.
*/
@@ -2034,6 +2109,45 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /* Try to avoid O(N^2) selectivity calculation */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If the clause is of the form "var OP something" or "something
+ * OP var", extract statistics for the variable. Otherwise, fall
+ * back to a default per-element estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ bool *elem_const = palloc_array(bool, num_elems);
+
+ /*
+ * All elements are constants here, since we deconstructed a
+ * Const array.
+ */
+ memset(elem_const, true, sizeof(bool) * num_elems);
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator,
+ clause->inputcollid, -1.0,
+ elem_values, elem_nulls,
+ num_elems, elem_const,
+ nominal_element_type, useOr,
+ isEquality, isInequality);
+
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2109,6 +2223,95 @@ scalararraysel(PlannerInfo *root,
get_typlenbyval(arrayexpr->element_typeid,
&elmlen, &elmbyval);
+ /* Try to avoid O(N^2) selectivity calculation */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+ int num_elems = list_length(arrayexpr->elements);
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then fall back to default code path to compute
+ * default selectivity.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ Selectivity nonconst_sel;
+ Datum *elem_values;
+ bool *elem_nulls;
+ bool *elem_const;
+ ListCell *lc;
+
+ /* Build arrays describing ARRAY[] elements */
+ elem_values = palloc_array(Datum, num_elems);
+ elem_nulls = palloc0_array(bool, num_elems);
+ elem_const = palloc0_array(bool, num_elems);
+
+ foreach(lc, arrayexpr->elements)
+ {
+ Node *elem_value = (Node *) lfirst(lc);
+ int i = foreach_current_index(lc);
+
+ if (IsA(elem_value, Const))
+ {
+ elem_values[i] = ((Const *) elem_value)->constvalue;
+ elem_nulls[i] = ((Const *) elem_value)->constisnull;
+ elem_const[i] = true;
+ }
+ else
+ {
+ elem_nulls[i] = false;
+ elem_const[i] = false;
+ }
+
+ /*
+ * When the array contains a NULL constant, same as
+ * var_eq_const, we assume the operator is strict and
+ * nothing will match, thus return 0.0.
+ */
+ if (!useOr && elem_nulls[i])
+ {
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ return (Selectivity) 0.0;
+ }
+ }
+
+ /*
+ * Non-Const elements cannot be matched against MCV entries so
+ * estimate their selectivity separately using a fallback.
+ */
+ nonconst_sel = var_eq_non_const(&vardata, operator,
+ clause->inputcollid,
+ other_op, var_on_left,
+ isInequality);
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator,
+ clause->inputcollid,
+ nonconst_sel, elem_values,
+ elem_nulls, num_elems,
+ elem_const,
+ nominal_element_type, useOr,
+ isEquality, isInequality);
+
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* We use the assumption of disjoint probabilities here too, although
* the odds of equal array elements are rather higher if the elements
@@ -2227,6 +2430,339 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+/*
+ * scalararray_mcv_hash_match - Selectivity of ScalarArrayOpExpr by O(N+M)
+ *
+ * This function matches IN-list elements against MCV entries of the variable,
+ * using either hashing (O(N+M)) or fallback nested-loop logic. For matched
+ * elements, selectivity is taken from MCV frequencies; unmatched elements are
+ * handled using fallback estimates.
+ *
+ * The resulting probabilities are combined using the standard ANY/ALL
+ * selectivity model (independent or disjoint events), identical to the
+ * generic estimator.
+ *
+ * Inputs:
+ * vardata: statistics for the variable
+ * operator: equality or inequality operator
+ * collation: collation to use
+ * nonconst_sel: fallback selectivity for non-Const elements
+ * elem_values: IN-list element values
+ * elem_nulls: NULL flags for elements
+ * elem_const: flags indicating Const elements
+ * num_elems: number of elements
+ * nominal_element_type: element type
+ * useOr: OR (ANY) vs AND (ALL) semantics
+ * isEquality: operator behaves like equality
+ * isInequality: operator behaves like inequality
+ *
+ * Result:
+ * Selectivity in [0,1], or -1.0 if MCV-based estimation is not applicable.
+ *
+ * Note:
+ * Assumes eqsel()/neqsel semantics. Each element is accounted for once,
+ * either via MCV match or fallback estimation.
+ */
+static double
+scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator,
+ Oid collation, Selectivity nonconst_sel,
+ Datum *elem_values, bool *elem_nulls, int num_elems,
+ bool *elem_const, Oid nominal_element_type,
+ bool useOr, bool isEquality, bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ /*
+ * If the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= MCV_HASH_THRESHOLD)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVHashContext hashContext;
+ Selectivity nonmcv_selec = 0.0;
+ double sumallcommon = 0.0;
+ bool isdefault;
+ bool hash_mcv;
+ double otherdistinct;
+ Datum *arrayHash;
+ Datum *arrayProbe;
+ int nvaluesHash;
+ int nvaluesProbe;
+ int nonmcv_cnt = num_elems;
+ int nonconst_cnt = 0;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ sumallcommon += sslot.numbers[i];
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This is
+ * the part of the column distribution not covered by MCVs.
+ */
+ nonmcv_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(nonmcv_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ nonmcv_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 && nonmcv_selec > sslot.numbers[sslot.nnumbers - 1])
+ nonmcv_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Make sure we build the hash table on the smaller array. */
+ if (sslot.nvalues <= num_elems)
+ {
+ hash_mcv = true;
+ nvaluesHash = sslot.nvalues;
+ nvaluesProbe = num_elems;
+ arrayHash = sslot.values;
+ arrayProbe = elem_values;
+ }
+ else
+ {
+ hash_mcv = false;
+ nvaluesHash = num_elems;
+ nvaluesProbe = sslot.nvalues;
+ arrayHash = elem_values;
+ arrayProbe = sslot.values;
+ }
+
+ fmgr_info(hash_mcv ? hashLeft : hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.op_is_reversed = hash_mcv;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(hash_mcv ? sslot.valuetype : nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVHashTable_create(CurrentMemoryContext,
+ nvaluesHash,
+ &hashContext);
+
+ /* Build a hash table over the smaller input side. */
+ for (int i = 0; i < nvaluesHash; i++)
+ {
+ bool found = false;
+ MCVHashEntry *entry;
+
+ /*
+ * When hashing IN-list values (hash_mcv == false), we only insert
+ * constant, non-NULL elements. NULL and non-Const elements are
+ * counted separately, because they cannot participate in MCV
+ * matching and must be handled later using generic selectivity
+ * estimation.
+ */
+ if (!hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ Assert(useOr);
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (!elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_insert(hashTable, arrayHash[i], &found);
+
+ /*
+ * entry->count tracks how many times the same value appears, so
+ * that duplicate IN-list elements can be folded into the
+ * probability calculation.
+ */
+ if (likely(!found))
+ {
+ entry->index = i;
+ entry->count = 1;
+ }
+ else
+ entry->count++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hash_mcv ? hashRight : hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ for (int i = 0; i < nvaluesProbe; i++)
+ {
+ MCVHashEntry *entry;
+ Selectivity s1;
+ int nvaluesmcv;
+
+ /*
+ * When probing with IN-list elements, ignore NULLs and non-Const
+ * expressions: they cannot be matched against MCVs and will be
+ * accounted for later by generic estimation.
+ */
+ if (hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ Assert(useOr);
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (!elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_lookup(hashTable, arrayProbe[i]);
+
+ /*
+ * If found, obtain its MCV frequency and remember how many values
+ * on the hashed side map to this entry.
+ */
+ if (entry != NULL)
+ {
+ s1 = hash_mcv ? sslot.numbers[entry->index]
+ : sslot.numbers[i];
+
+ nvaluesmcv = entry->count;
+
+ accum_scalararray_prob(s1, nvaluesmcv, useOr, isEquality,
+ isInequality, nullfrac, true, &selec,
+ &s1disjoint);
+
+ /* Matched values are no longer considered non-MCV */
+ nonmcv_cnt -= nvaluesmcv;
+ }
+ }
+
+ nonmcv_cnt = Max(nonmcv_cnt, 0);
+
+ /*
+ * Account for constant IN-list values that did not match any MCV.
+ *
+ * Each such value is assumed to have probability = nonmcv_selec,
+ * derived from the remaining (non-MCV) probability mass.
+ */
+ accum_scalararray_prob(nonmcv_selec, nonmcv_cnt, useOr, isEquality,
+ isInequality, nullfrac, true,
+ &selec, &s1disjoint);
+
+ /*
+ * Account for non-Const IN-list elements.
+ *
+ * These values cannot be matched against MCVs, so we rely on the
+ * operator's generic selectivity estimator for each of them.
+ */
+ accum_scalararray_prob(nonconst_sel, nonconst_cnt, useOr, isEquality,
+ isInequality, nullfrac, false,
+ &selec, &s1disjoint);
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed distinct,
+ * the events are disjoint and the total probability is the sum of
+ * individual probabilities. Use that estimate if it lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+
+ CLAMP_PROBABILITY(selec);
+
+ MCVHashTable_destroy(hashTable);
+ }
+
+ if (have_mcvs)
+ free_attstatsslot(&sslot);
+
+ return selec;
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
@@ -2463,7 +2999,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
* If the MCV lists are long enough to justify hashing, try to look up
* hash functions for the join operator.
*/
- if ((sslot1.nvalues + sslot2.nvalues) >= EQJOINSEL_MCV_HASH_THRESHOLD)
+ if ((sslot1.nvalues + sslot2.nvalues) >= MCV_HASH_THRESHOLD)
(void) get_op_hash_functions(operator, &hashLeft, &hashRight);
}
else
@@ -3103,6 +3639,7 @@ eqjoinsel_find_matches(FmgrInfo *eqproc, Oid collation,
/*
* Support functions for the hash tables used by eqjoinsel_find_matches
+ * and scalararray_mcv_hash_match
*/
static uint32
hash_mcv(MCVHashTable_hash *tab, Datum key)
--
2.34.1
^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2026-04-08 16:48 UTC | newest]
Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-12-29 20:35 Hash-based MCV matching for large IN-lists Ilia Evdokimov <[email protected]>
2026-03-02 21:37 ` Zsolt Parragi <[email protected]>
2026-03-10 14:55 ` Ilia Evdokimov <[email protected]>
2026-03-11 08:01 ` Zsolt Parragi <[email protected]>
2026-03-20 15:58 ` Ilia Evdokimov <[email protected]>
2026-04-08 16:48 ` Ilia Evdokimov <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox