public inbox for [email protected]
help / color / mirror / Atom feedRe: Hash-based MCV matching for large IN-lists
10+ messages / 6 participants
[nested] [flat]
* Re: Hash-based MCV matching for large IN-lists
@ 2026-01-14 10:19 Ilia Evdokimov <[email protected]>
0 siblings, 2 replies; 10+ messages in thread
From: Ilia Evdokimov @ 2026-01-14 10:19 UTC (permalink / raw)
To: David Geier <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi David!
Thanks for feedback.
On 05.01.2026 11:54, David Geier wrote:
>> This patch introduces a hash-based matching path, analogous to what is
>> already done for MCV matching in join selectivity estimation (057012b
>> commit). Instead of linearly scanning the MCV array for each IN-list
>> element, we build a hash table and probe it to identify matches.
>>
>> The hash table is built over the MCV values, not over the IN-list. The
>> IN-list may contain NULLs, non-Const expressions, and duplicate values,
>> whereas the MCV list is guaranteed to contain distinct, non-NULL values
>> and represents the statistically meaningful domain we are matching
>> against. Hashing the MCVs therefore avoids duplicate work and directly
>> supports selectivity estimation.
> The downside of doing it this way is that we always pay the price of
> building a possibly big hash table if the column has a lot of MCVs, even
> for small IN lists. Why can't we build the hash table always on the
> smaller list, like we do already in the join selectivity estimation?
>
> For NULL we can add a flag to the hash entry, non-Const expressions must
> anyways be evaluated and duplicate values will be discarded during insert.
After thinking more about this I realized that this is actually a better
match for how selectivity is currently modeled. After this comments in
master
* If we were being really tense we would try to confirm that the
* elements are all distinct, but that would be expensive and it
* doesn't seem to be worth the cycles; it would amount to
penalizing
* well-written queries in favor of poorly-written ones.
However, we
* do protect ourselves a little bit by checking whether the
* disjointness assumption leads to an impossible (out of range)
* probability; if so, we fall back to the normal calculation.
when the hash table is built on the IN-list, duplicate IN-list values
are automatically eliminated during insertion, so we no longer risk
summing the same MCV frequency multiple times. This makes the
disjoint-probability estimate more robust and in practice slightly more
accurate.
One thing I initially missed that there are actually three different
places where ScalarArrayOpExpr is handled - the Const array case, the
ArrayExpr case and others - and Const and ArrayExpr require different
implementation of the same idea. In Const case we can directly hash and
probe Datum value, while ArrayExpr case we must work on Node* element,
separating constant and non-constant entries and only hashing the
constants. The current v2 therefore applies the same MCV-hash
optimization in both branches, but using two tailored code paths that
preserve the existing semantics of how non-Const elements are handled by
var_eq_non_const().
If the MCV list is smaller than the IN-list, the behavior is the same as
in v1 of the patch. If the IN-list is smaller, we instead build a hash
table over the distinct constant elements of the IN-list and then:
- Scan the MCV list and sum the frequencies of those MCVs that appear in
the IN-list;
- Count how many distinct IN-list not null constant elements are not
present in the MCV list;
- Estimate the probability of each such non-MCV value using the
remaining frequency mass;
- Handle non-constant IN-list elements separately using
var_eq_non_const(), exactly as in the existing implementation.
>> For each IN-list element, if a matching MCV is found, we add the
>> corresponding MCV frequency to the selectivity estimate. If no match is
>> found, the remaining selectivity is estimated in the same way as the
>> existing non-MCV path (similar to var_eq_const when the constant is not
>> present in the MCV list).
>>
> The code in master currently calls an operator-specific selectivity
> estimation function. For equality this is typically eqsel() but the
> function can be specified during CREATE OPERATOR.
>
> Can be safely special-case the behavior of eqsel() for all possible
> operators for the ScalarArrayOpExpr case?
Unfortunately there is no safe way to make this optimization generic for
arbitrary restrict functions, because a custom RESTRICT function does
not have to use MCVs at all. IMO, in practice the vast majority of
ScalarArrayOpExpr uses with = or <> rely on the built-in equality
operators whose selectivity is computed by eqsel()/neqsel(), so I
limited this optimization to those cases.
I’ve attached v2 of the patch. It currently uses two fairly large helper
functions for the Const and ArrayExpr cases; this is intentional to keep
the logic explicit and reviewable, even though these will likely need
refactoring or consolidation later.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] v2-0001-Use-hash-based-matching-for-MCVs-in-ScalarArrayOp.patch (37.2K, 3-v2-0001-Use-hash-based-matching-for-MCVs-in-ScalarArrayOp.patch)
download | inline diff:
From 933c89c5a2f42bb53c5663bc46f25473bf2a97ba Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Wed, 14 Jan 2026 13:13:21 +0300
Subject: [PATCH v2] Use hash-based matching for MCVs in ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using nested loops. For large IN-lists and/or MCV
lists this results in O(N*M) planning-time behavior.
This patch introduces a hash-based matching strategy, analogous to what
is already used for MCV matching in join selectivity estimation. The
implementation covers both ScalarArrayOpExpr forms:
1. Const array, where IN-list elements are available as Datums, and
2. ArrayExpr, where elements are Nodes and may include non-constant
expressions.
In both cases, when MCV statistics and suitable hash functions are
available, the algorithm chooses the smaller side to hash: either the
MCV list or the distinct constant elements of the IN-list. The other
side is then scanned once, reducing the complexity from O(N*M) to
O(N+M).
For the ArrayExpr case, only constant IN-list elements participate in
hashing and MCV matching; non-constant elements are handled separately
using var_eq_non_const(), preserving the existing semantics and
probability model.
When the hash table is built over the IN-list, duplicate constant values
are eliminated during insertion. This prevents double counting of MCV
frequencies and produces a more robust disjoint-probability estimate for
= ANY and <> ALL.
The hash-based path is used only for equality and inequality operators
that use eqsel()/neqsel(), when MCV statistics are available and suitable
hash functions exist for the operator.
---
src/backend/utils/adt/selfuncs.c | 1060 +++++++++++++++++++++++++++++-
src/tools/pgindent/typedefs.list | 3 +
2 files changed, 1062 insertions(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 29fec655593..161d3de530e 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,7 +146,7 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
@@ -156,6 +156,12 @@
#define EQJOINSEL_MCV_HASH_THRESHOLD 20
#endif
+#ifndef USE_ASSERT_CHECKING
+#define SCALARARRAY_MCV_HASH_THRESHOLD 200
+#else
+#define SCALARARRAY_MCV_HASH_THRESHOLD 20
+#endif
+
/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
typedef struct MCVHashEntry
{
@@ -176,14 +182,44 @@ typedef struct MCVHashContext
int16 hash_typlen; /* typlen of hashed data type */
} MCVHashContext;
+/* Entries in the simplehash hash table used by scalararray_mcv_hash_match */
+typedef struct MCVInHashEntry
+{
+ Datum value; /* the value represented by this entry */
+ int index; /* its index in the relevant AttStatsSlot */
+ uint32 hash; /* hash code for the Datum */
+ char status; /* status code used by simplehash.h */
+} MCVInHashEntry;
+
+/* private_data for the simplehash hash table */
+typedef struct MCVInHashContext
+{
+ FunctionCallInfo equal_fcinfo; /* the equality join operator */
+ FunctionCallInfo hash_fcinfo; /* the hash function to use */
+ bool insert_mode; /* doing inserts or lookups? */
+ bool hash_typbyval; /* typbyval of hashed data type */
+ int16 hash_typlen; /* typlen of hashed data type */
+} MCVInHashContext;
+
/* forward reference */
typedef struct MCVHashTable_hash MCVHashTable_hash;
+typedef struct MCVInHashTable_hash MCVInHashTable_hash;
/* Hooks for plugins to get control when we ask for stats */
get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match_expr(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left, ArrayExpr *arrayexpr,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality);
+static double scalararray_mcv_hash_match_const(VariableStatData *vardata, Oid operator, Oid collation,
+ Datum *elem_values, bool *elem_nulls, int num_elems,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality);
+static uint32 hash_mcv_in(MCVInHashTable_hash *tab, Datum key);
+static bool mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -287,6 +323,19 @@ static double btcost_correlation(IndexOptInfo *index,
#define SH_DECLARE
#include "lib/simplehash.h"
+#define SH_PREFIX MCVInHashTable
+#define SH_ELEMENT_TYPE MCVInHashEntry
+#define SH_KEY_TYPE Datum
+#define SH_KEY value
+#define SH_HASH_KEY(tab,key) hash_mcv_in(tab, key)
+#define SH_EQUAL(tab,key0,key1) mcvs_in_equal(tab, key0, key1)
+#define SH_SCOPE static inline
+#define SH_STORE_HASH
+#define SH_GET_HASH(tab,ent) (ent)->hash
+#define SH_DEFINE
+#define SH_DECLARE
+#include "lib/simplehash.h"
+
/*
* eqsel - Selectivity of "=" for any data types.
@@ -2025,6 +2074,35 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then punt and return a default estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ s1 = scalararray_mcv_hash_match_const(&vardata, operator, clause->inputcollid,
+ elem_values, elem_nulls, num_elems,
+ nominal_element_type, useOr, isEquality, isInequality);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2100,6 +2178,35 @@ scalararraysel(PlannerInfo *root,
get_typlenbyval(arrayexpr->element_typeid,
&elmlen, &elmbyval);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then punt and return a default estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ s1 = scalararray_mcv_hash_match_expr(&vardata, operator, clause->inputcollid,
+ other_op, var_on_left, arrayexpr,
+ nominal_element_type, useOr, isEquality, isInequality);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* We use the assumption of disjoint probabilities here too, although
* the odds of equal array elements are rather higher if the elements
@@ -2210,6 +2317,957 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr (ANY/ALL) using MCV statistics.
+ *
+ * This function implements the same probability model as the standard
+ * ScalarArrayOpExpr estimation code (independent or disjoint probabilities
+ * for OR/AND), but uses MCV statistics and hashing to compute
+ * per-element probabilities efficiently when possible.
+ *
+ * MCVs are used only to obtain more accurate values; the combination
+ * of frequencies follows the standard ANY/ALL formulas.
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * other_op: expression for the non-variable side of the comparison
+ * var_on_left: true if the variable is on the left side of the operator
+ * arrayexpr: array of IN-list with expressions
+ * nominal_element_type: type of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator is equality
+ * isInequality: true if the operator is inequality
+ *
+ * Result:
+ * Returns a selectivity estimate in the range [0.0, 1.0], or -1.0 if the
+ * selectivity cannot be reliably estimated by this function.
+ *
+ * Note: this function assumes that the operator’s selectivity behavior
+ * matches eqsel()/neqsel semantics (equality or inequality).
+ * It must not be used for operators with custom or non-standard
+ * selectivity functions.
+ */
+static double
+scalararray_mcv_hash_match_expr(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left, ArrayExpr *arrayexpr,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+ int num_elems = 0;
+ ListCell *l;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+
+ num_elems = list_length(arrayexpr->elements);
+
+ /*
+ * f the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= SCALARARRAY_MCV_HASH_THRESHOLD)
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVInHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVInHashContext hashContext;
+ double sumallcommon = 0.0,
+ remaining_selec = 0.0;
+ bool isdefault;
+ double otherdistinct;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ if (sslot.nvalues <= num_elems)
+ {
+ /*
+ * Build a hash table over the MCV values.
+ *
+ * For each IN-list element, we compute p(x = element): constants
+ * are matched against MCVs (or the residual frequency), while
+ * non-constant elements fall back to the operator's generic
+ * selectivity estimator. These probabilities are then combined
+ * using the standard ANY/ALL formulas.
+ */
+ fmgr_info(hashLeft, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(sslot.valuetype,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVInHashTable_create(CurrentMemoryContext,
+ sslot.nvalues,
+ &hashContext);
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ {
+ bool found = false;
+ MCVInHashEntry *entry;
+
+ entry = MCVInHashTable_insert(hashTable, sslot.values[i], &found);
+
+ /*
+ * MCVHashTable_insert will only report "found" if the new
+ * value is equal to some previous one per datum_image_eq().
+ * That probably shouldn't happen, since we're not expecting
+ * duplicates in the MCV list. If we do find a dup, just
+ * ignore it, leaving the hash entry's index pointing at the
+ * first occurrence. That matches the behavior that the
+ * non-hashed code path would have.
+ */
+ if (likely(!found))
+ entry->index = i;
+
+ sumallcommon += sslot.numbers[i];
+ }
+
+ /*
+ * Prepare to probe the hash table. If the probe values are of a
+ * different data type, then we need to change hash functions.
+ * (This code relies on the assumption that since we defined
+ * SH_STORE_HASH, simplehash.h will never need to compute hash
+ * values for existing hash table entries.)
+ */
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hashRight, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ /*
+ * remaining_selec is the total probability mass of all non-MCV
+ * values (i.e., 1 - sum(MCV frequencies) - nullfrac). This mass
+ * is later divided among the remaining distinct values to
+ * estimate p(x = c) for constants not present in the MCV list.
+ */
+ remaining_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(remaining_selec);
+
+ /*
+ * and in fact it's probably a good deal less. We approximate that
+ * all the not-common values share this remaining fraction
+ * equally, so we divide by the number of other distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ remaining_selec /= otherdistinct;
+
+ /*
+ * Another cross-check: selectivity shouldn't be estimated as more
+ * than the least common "most common value".
+ */
+ if (sslot.nnumbers > 0 && remaining_selec > sslot.numbers[sslot.nnumbers - 1])
+ remaining_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Evaluate selectivity contribution of each IN-list element. */
+ foreach(l, arrayexpr->elements)
+ {
+ MCVInHashEntry *entry;
+ Selectivity s1;
+ Node *elem_value = (Node *) lfirst(l);
+
+ if (IsA(elem_value, Const))
+ {
+ /*
+ * If the constant is NULL, assume operator is strict and
+ * return zero, ie, operator will never return TRUE.
+ * (It's zero even for a negator op.)
+ */
+ if (((Const *) elem_value)->constisnull)
+ continue;
+
+ entry = MCVInHashTable_lookup(hashTable, ((Const *) elem_value)->constvalue);
+
+ if (entry != NULL)
+ {
+ /*
+ * As in the other code path, skip already-matched
+ * hash entries
+ */
+ s1 = sslot.numbers[entry->index];
+ }
+ else
+ {
+ s1 = remaining_selec;
+ }
+
+ if (isInequality)
+ s1 = 1.0 - s1 - nullfrac;
+ }
+ else
+ {
+ s1 = var_eq_non_const(vardata, operator, collation, other_op,
+ var_on_left, isInequality);
+ }
+
+ CLAMP_PROBABILITY(s1);
+
+ if (useOr)
+ {
+ selec = selec + s1 - selec * s1;
+ if (isEquality)
+ s1disjoint += s1;
+ }
+ else
+ {
+ selec = selec * s1;
+ if (isInequality)
+ s1disjoint += s1 - 1.0;
+ }
+ }
+
+ MCVInHashTable_destroy(hashTable);
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed
+ * distinct, the events are disjoint and the total probability is
+ * the sum of individual probabilities. Use that estimate if it
+ * lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+ }
+ else
+ {
+ /*
+ * Here the IN-list is shorter than the MCV list.
+ *
+ * We build a hash table over the IN-list values so that we can
+ * quickly determine which MCVs are referenced by the query.
+ *
+ * We then: - sum probabilities of MCVs that appear in the IN-list
+ * - count how many IN-list values are not MCVs - estimate their
+ * probabilities from the remaining frequency mass - combine
+ * everything using the standard ANY/ALL probability rules
+ */
+ int distinct_in = 0;
+ int mcv_matches = 0;
+ int non_mcv_in = 0;
+ double mcv_sum = 0.0;
+ Selectivity disjoint_sel = 0.0;
+
+ fmgr_info(hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ /*
+ * Build a hash table over the distinct constant values from the
+ * IN-list. Duplicates are ignored because ANY/ALL semantics
+ * assume the elements are distinct for disjoint-probability
+ * estimation.
+ */
+ hashTable = MCVInHashTable_create(CurrentMemoryContext,
+ num_elems,
+ &hashContext);
+
+ /*
+ * Insert all constant IN-list elements into the hash table,
+ * keeping only one copy of each distinct value.
+ */
+ foreach(l, arrayexpr->elements)
+ {
+ bool found = false;
+ Node *elem_value = (Node *) lfirst(l);
+
+ if (!IsA(elem_value, Const))
+ continue;
+
+ if (((Const *) elem_value)->constisnull)
+ continue;
+
+ MCVInHashTable_insert(hashTable, ((Const *) elem_value)->constvalue, &found);
+
+ if (likely(!found))
+ distinct_in++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ /*
+ * Scan the MCV list once.
+ *
+ * For each MCV value: - always add its frequency to sumallcommon
+ * (needed to compute the remaining non-MCV probability mass) - if
+ * the MCV appears in the IN-list, record it as a match and add
+ * its probability to mcv_sum
+ *
+ * At the same time we also fold its contribution into the running
+ * ANY/ALL probability (selec), exactly as the generic code does.
+ */
+ for (int i = 0; i < sslot.nvalues; i++)
+ {
+ MCVInHashEntry *entry;
+
+ sumallcommon += sslot.numbers[i];
+
+ entry = MCVInHashTable_lookup(hashTable, sslot.values[i]);
+ if (entry != NULL)
+ {
+ Selectivity s1 = sslot.numbers[i];
+
+ mcv_sum += s1;
+ mcv_matches++;
+
+ if (isInequality)
+ s1 = 1.0 - s1 - nullfrac;
+
+ CLAMP_PROBABILITY(s1);
+
+ if (useOr)
+ selec = selec + s1 - selec * s1;
+ else
+ selec = selec * s1;
+ }
+ }
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This
+ * is the part of the column distribution not covered by MCVs.
+ */
+ remaining_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(remaining_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ remaining_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 && remaining_selec > sslot.numbers[sslot.nnumbers - 1])
+ remaining_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /*
+ * Number of IN-list values that are not MCVs. Each of these is
+ * assumed to have probability = remaining_selec.
+ */
+ non_mcv_in = distinct_in - mcv_matches;
+ Assert(non_mcv_in >= 0);
+
+ /*
+ * Compute the disjoint-probability estimate:
+ *
+ * s = sum of probabilities of all MCV values present in the
+ * IN-list + sum of probabilities of all non-MCV IN-list values
+ *
+ * This is valid for = ANY and <> ALL when the IN-list elements
+ * are assumed distinct.
+ */
+ disjoint_sel = mcv_sum + non_mcv_in * remaining_selec;
+
+ if (isInequality)
+ disjoint_sel = 1.0 - disjoint_sel - nullfrac;
+
+ if ((useOr ? isEquality : isInequality) &&
+ disjoint_sel >= 0.0 && disjoint_sel <= 1.0)
+ selec = disjoint_sel;
+
+ /*
+ * These elements cannot be matched against MCVs and therefore
+ * must always be handled by the generic estimator. Their
+ * probabilities are combined into the same ANY/ALL probability
+ * chain to preserve the correct probability model.
+ */
+ foreach(l, arrayexpr->elements)
+ {
+ Selectivity s1 = 0.0;
+ Node *elem_value = (Node *) lfirst(l);
+
+ if (IsA(elem_value, Const))
+ continue;
+
+ s1 = var_eq_non_const(vardata, operator, collation,
+ other_op, var_on_left, isInequality);
+
+ if (isInequality)
+ s1 = 1.0 - s1 - nullfrac;
+
+ CLAMP_PROBABILITY(s1);
+
+ if (useOr)
+ selec = selec + s1 - selec * s1;
+ else
+ selec = selec * s1;
+ }
+
+ MCVInHashTable_destroy(hashTable);
+ }
+
+ CLAMP_PROBABILITY(selec);
+ }
+
+ free_attstatsslot(&sslot);
+
+ return selec;
+}
+
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr (ANY/ALL) using MCV statistics.
+ *
+ * This function implements the same probability model as the standard
+ * ScalarArrayOpExpr estimation code (independent or disjoint probabilities
+ * for OR/AND), but uses MCV statistics and hashing to compute
+ * per-element probabilities efficiently when possible.
+ *
+ * MCVs are used only to obtain more accurate values; the combination
+ * of frequencies follows the standard ANY/ALL formulas.
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * other_op: expression for the non-variable side of the comparison
+ * var_on_left: true if the variable is on the left side of the operator
+ * elem_values: array of IN-list element const values
+ * elem_nulls: array indicating which IN-list elements are NULL
+ * num_elems: number of IN-list const elements
+ * nominal_element_type: type of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator is equality
+ * isInequality: true if the operator is inequality
+ *
+ * Result:
+ * Returns a selectivity estimate in the range [0.0, 1.0], or -1.0 if the
+ * selectivity cannot be reliably estimated by this function.
+ *
+ * Note: this function assumes that the operator’s selectivity behavior
+ * matches eqsel()/neqsel semantics (equality or inequality).
+ * It must not be used for operators with custom or non-standard
+ * selectivity functions.
+ */
+static double
+scalararray_mcv_hash_match_const(VariableStatData *vardata, Oid operator, Oid collation,
+ Datum *elem_values, bool *elem_nulls, int num_elems,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+
+ /*
+ * f the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= SCALARARRAY_MCV_HASH_THRESHOLD)
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVInHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVInHashContext hashContext;
+ double sumallcommon = 0.0,
+ remaining_selec = 0.0;
+ bool isdefault;
+ double otherdistinct;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ if (sslot.nvalues <= num_elems)
+ {
+ /*
+ * Build a hash table over the MCV values.
+ *
+ * For each IN-list element, we compute p(x = element): constants
+ * are matched against MCVs (or the residual frequency). These
+ * probabilities are then combined using the standard ANY/ALL
+ * formulas.
+ */
+ fmgr_info(hashLeft, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(sslot.valuetype,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVInHashTable_create(CurrentMemoryContext,
+ sslot.nvalues,
+ &hashContext);
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ {
+ bool found = false;
+ MCVInHashEntry *entry;
+
+ entry = MCVInHashTable_insert(hashTable, sslot.values[i], &found);
+
+ /*
+ * MCVHashTable_insert will only report "found" if the new
+ * value is equal to some previous one per datum_image_eq().
+ * That probably shouldn't happen, since we're not expecting
+ * duplicates in the MCV list. If we do find a dup, just
+ * ignore it, leaving the hash entry's index pointing at the
+ * first occurrence. That matches the behavior that the
+ * non-hashed code path would have.
+ */
+ if (likely(!found))
+ entry->index = i;
+
+ sumallcommon += sslot.numbers[i];
+ }
+
+ /*
+ * Prepare to probe the hash table. If the probe values are of a
+ * different data type, then we need to change hash functions.
+ * (This code relies on the assumption that since we defined
+ * SH_STORE_HASH, simplehash.h will never need to compute hash
+ * values for existing hash table entries.)
+ */
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hashRight, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ /*
+ * remaining_selec is the total probability mass of all non-MCV
+ * values (i.e., 1 - sum(MCV frequencies) - nullfrac). This mass
+ * is later divided among the remaining distinct values to
+ * estimate p(x = c) for constants not present in the MCV list.
+ */
+ remaining_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(remaining_selec);
+
+ /*
+ * and in fact it's probably a good deal less. We approximate that
+ * all the not-common values share this remaining fraction
+ * equally, so we divide by the number of other distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ remaining_selec /= otherdistinct;
+
+ /*
+ * Another cross-check: selectivity shouldn't be estimated as more
+ * than the least common "most common value".
+ */
+ if (sslot.nnumbers > 0 && remaining_selec > sslot.numbers[sslot.nnumbers - 1])
+ remaining_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Evaluate selectivity contribution of each IN-list element. */
+ for (int i = 0; i < num_elems; i++)
+ {
+ MCVInHashEntry *entry;
+ Selectivity s1;
+
+ /*
+ * If the constant is NULL, assume operator is strict and
+ * return zero, ie, operator will never return TRUE. (It's
+ * zero even for a negator op.)
+ */
+ if (elem_nulls[i])
+ continue;
+
+ entry = MCVInHashTable_lookup(hashTable, elem_values[i]);
+
+ if (entry != NULL)
+ {
+ /*
+ * As in the other code path, skip already-matched hash
+ * entries
+ */
+ s1 = sslot.numbers[entry->index];
+ }
+ else
+ s1 = remaining_selec;
+
+ if (isInequality)
+ s1 = 1.0 - s1 - nullfrac;
+
+ CLAMP_PROBABILITY(s1);
+
+ if (useOr)
+ {
+ selec = selec + s1 - selec * s1;
+ if (isEquality)
+ s1disjoint += s1;
+ }
+ else
+ {
+ selec = selec * s1;
+ if (isInequality)
+ s1disjoint += s1 - 1.0;
+ }
+ }
+
+ MCVInHashTable_destroy(hashTable);
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed
+ * distinct, the events are disjoint and the total probability is
+ * the sum of individual probabilities. Use that estimate if it
+ * lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+ }
+ else
+ {
+ /*
+ * Here the IN-list is shorter than the MCV list.
+ *
+ * We build a hash table over the IN-list values so that we can
+ * quickly determine which MCVs are referenced by the query.
+ *
+ * We then: - sum probabilities of MCVs that appear in the IN-list
+ * - count how many IN-list values are not MCVs - estimate their
+ * probabilities from the remaining frequency mass - combine
+ * everything using the standard ANY/ALL probability rules
+ */
+ int distinct_in = 0;
+ int mcv_matches = 0;
+ int non_mcv_in = 0;
+ double mcv_sum = 0.0;
+ Selectivity disjoint_sel = 0.0;
+
+ fmgr_info(hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ /*
+ * Build a hash table over the distinct constant values from the
+ * IN-list. Duplicates are ignored because ANY/ALL semantics
+ * assume the elements are distinct for disjoint-probability
+ * estimation.
+ */
+ hashTable = MCVInHashTable_create(CurrentMemoryContext,
+ num_elems,
+ &hashContext);
+
+ /*
+ * Insert all constant IN-list elements into the hash table,
+ * keeping only one copy of each distinct value.
+ */
+ for (int i = 0; i < num_elems; i++)
+ {
+ bool found = false;
+
+ if (elem_nulls[i])
+ continue;
+
+ MCVInHashTable_insert(hashTable, elem_values[i], &found);
+
+ if (likely(!found))
+ distinct_in++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ /*
+ * Scan the MCV list once.
+ *
+ * For each MCV value: - always add its frequency to sumallcommon
+ * (needed to compute the remaining non-MCV probability mass) - if
+ * the MCV appears in the IN-list, record it as a match and add
+ * its probability to mcv_sum
+ *
+ * At the same time we also fold its contribution into the running
+ * ANY/ALL probability (selec), exactly as the generic code does.
+ */
+ for (int i = 0; i < sslot.nvalues; i++)
+ {
+ MCVInHashEntry *entry;
+
+ sumallcommon += sslot.numbers[i];
+
+ entry = MCVInHashTable_lookup(hashTable, sslot.values[i]);
+ if (entry != NULL)
+ {
+ Selectivity s1 = sslot.numbers[i];
+
+ mcv_sum += s1;
+ mcv_matches++;
+
+ if (isInequality)
+ s1 = 1.0 - s1 - nullfrac;
+
+ CLAMP_PROBABILITY(s1);
+
+ if (useOr)
+ selec = selec + s1 - selec * s1;
+ else
+ selec = selec * s1;
+ }
+ }
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This
+ * is the part of the column distribution not covered by MCVs.
+ */
+ remaining_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(remaining_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ remaining_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 && remaining_selec > sslot.numbers[sslot.nnumbers - 1])
+ remaining_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /*
+ * Number of IN-list values that are not MCVs. Each of these is
+ * assumed to have probability = remaining_selec.
+ */
+ non_mcv_in = distinct_in - mcv_matches;
+ Assert(non_mcv_in >= 0);
+
+ /*
+ * Compute the disjoint-probability estimate:
+ *
+ * s = sum of probabilities of all MCV values present in the
+ * IN-list + sum of probabilities of all non-MCV IN-list values
+ *
+ * This is valid for = ANY and <> ALL when the IN-list elements
+ * are assumed distinct.
+ */
+ disjoint_sel = mcv_sum + non_mcv_in * remaining_selec;
+
+ if (isInequality)
+ disjoint_sel = 1.0 - disjoint_sel - nullfrac;
+
+ if ((useOr ? isEquality : isInequality) &&
+ disjoint_sel >= 0.0 && disjoint_sel <= 1.0)
+ selec = disjoint_sel;
+
+ MCVInHashTable_destroy(hashTable);
+ }
+
+ CLAMP_PROBABILITY(selec);
+ }
+
+ free_attstatsslot(&sslot);
+
+ return selec;
+}
+
+/*
+ * Support functions for the hash tables used by eqjoinsel_find_matches
+ */
+static uint32
+hash_mcv_in(MCVInHashTable_hash *tab, Datum key)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+ FunctionCallInfo fcinfo = context->hash_fcinfo;
+ Datum fresult;
+
+ fcinfo->args[0].value = key;
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ Assert(!fcinfo->isnull);
+ return DatumGetUInt32(fresult);
+}
+
+static bool
+mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+
+ if (context->insert_mode)
+ {
+ /*
+ * During the insertion step, any comparisons will be between two
+ * Datums of the hash table's data type, so if the given operator is
+ * cross-type it will be the wrong thing to use. Fortunately, we can
+ * use datum_image_eq instead. The MCV values should all be distinct
+ * anyway, so it's mostly pro-forma to compare them at all.
+ */
+ return datum_image_eq(key0, key1,
+ context->hash_typbyval, context->hash_typlen);
+ }
+ else
+ {
+ FunctionCallInfo fcinfo = context->equal_fcinfo;
+ Datum fresult;
+
+ fcinfo->args[0].value = key0;
+ fcinfo->args[1].value = key1;
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ return (!fcinfo->isnull && DatumGetBool(fresult));
+ }
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 14dec2d49c1..adfe8f4c588 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1667,6 +1667,9 @@ MBuf
MCVHashContext
MCVHashEntry
MCVHashTable_hash
+MCVInHashContext
+MCVInHashEntry
+MCVInHashTable_hash
MCVItem
MCVList
MEMORY_BASIC_INFORMATION
--
2.34.1
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-01-19 14:01 David Geier <[email protected]>
parent: Ilia Evdokimov <[email protected]>
1 sibling, 1 reply; 10+ messages in thread
From: David Geier @ 2026-01-19 14:01 UTC (permalink / raw)
To: Ilia Evdokimov <[email protected]>; PostgreSQL Hackers <[email protected]>
On 14.01.2026 11:19, Ilia Evdokimov wrote:
> After thinking more about this I realized that this is actually a better
> match for how selectivity is currently modeled. After this comments in
> master
>
> * If we were being really tense we would try to confirm that the
> * elements are all distinct, but that would be expensive and it
> * doesn't seem to be worth the cycles; it would amount to
> penalizing
> * well-written queries in favor of poorly-written ones.
> However, we
> * do protect ourselves a little bit by checking whether the
> * disjointness assumption leads to an impossible (out of range)
> * probability; if so, we fall back to the normal calculation.
>
> when the hash table is built on the IN-list, duplicate IN-list values
> are automatically eliminated during insertion, so we no longer risk
> summing the same MCV frequency multiple times. This makes the disjoint-
> probability estimate more robust and in practice slightly more accurate.
Does that mean that we get a different estimation result, depending on
if the IN list is smaller or not? I think we should avoid that because
estimation quality might flip for the user unexpectedly.
> One thing I initially missed that there are actually three different
> places where ScalarArrayOpExpr is handled - the Const array case, the
> ArrayExpr case and others - and Const and ArrayExpr require different
> implementation of the same idea. In Const case we can directly hash and
> probe Datum value, while ArrayExpr case we must work on Node* element,
> separating constant and non-constant entries and only hashing the
> constants. The current v2 therefore applies the same MCV-hash
> optimization in both branches, but using two tailored code paths that
> preserve the existing semantics of how non-Const elements are handled by
> var_eq_non_const().
>
> If the MCV list is smaller than the IN-list, the behavior is the same as
> in v1 of the patch. If the IN-list is smaller, we instead build a hash
> table over the distinct constant elements of the IN-list and then:
> - Scan the MCV list and sum the frequencies of those MCVs that appear in
> the IN-list;
> - Count how many distinct IN-list not null constant elements are not
> present in the MCV list;
Is this to make sure we keep getting the same estimation result if the
IN list is smaller and contains duplicates?
> - Estimate the probability of each such non-MCV value using the
> remaining frequency mass;
> - Handle non-constant IN-list elements separately using
> var_eq_non_const(), exactly as in the existing implementation.
OK
>>>
>> The code in master currently calls an operator-specific selectivity
>> estimation function. For equality this is typically eqsel() but the
>> function can be specified during CREATE OPERATOR.
>>
>> Can be safely special-case the behavior of eqsel() for all possible
>> operators for the ScalarArrayOpExpr case?
>
>
> Unfortunately there is no safe way to make this optimization generic for
> arbitrary restrict functions, because a custom RESTRICT function does
> not have to use MCVs at all. IMO, in practice the vast majority of
> ScalarArrayOpExpr uses with = or <> rely on the built-in equality
> operators whose selectivity is computed by eqsel()/neqsel(), so I
> limited this optimization to those cases.
How did you do that? I cannot find the code that checks for that.
> I’ve attached v2 of the patch. It currently uses two fairly large helper
> functions for the Const and ArrayExpr cases; this is intentional to keep
> the logic explicit and reviewable, even though these will likely need
> refactoring or consolidation later.
Beyond that, it seems like you can also combine/reuse a bunch of code
for creating the hash map on the IN vs on the MCV list.
For the MCVs, can't we reuse some code from the eqjoinsel() optimization
we did? The entry and context structs look similar enough to only need one.
Making the code more compact would ease reviewing a lot.
--
David Geier
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-01-27 15:43 Ilia Evdokimov <[email protected]>
parent: David Geier <[email protected]>
0 siblings, 0 replies; 10+ messages in thread
From: Ilia Evdokimov @ 2026-01-27 15:43 UTC (permalink / raw)
To: David Geier <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
On 19.01.2026 17:01, David Geier wrote:
> Does that mean that we get a different estimation result, depending on
> if the IN list is smaller or not? I think we should avoid that because
> estimation quality might flip for the user unexpectedly.
I think you're right.
To address this, I changed the hash-table entry to track an additional
'count' filed, representing how many times a particular value appears on
the hashed side. When inserting into the hash table, if the value is
already present, I increment 'count', otherwise, I create a new entry
with count = 1
>>> The code in master currently calls an operator-specific selectivity
>>> estimation function. For equality this is typically eqsel() but the
>>> function can be specified during CREATE OPERATOR.
>>>
>>> Can be safely special-case the behavior of eqsel() for all possible
>>> operators for the ScalarArrayOpExpr case?
>>
>> Unfortunately there is no safe way to make this optimization generic for
>> arbitrary restrict functions, because a custom RESTRICT function does
>> not have to use MCVs at all. IMO, in practice the vast majority of
>> ScalarArrayOpExpr uses with = or <> rely on the built-in equality
>> operators whose selectivity is computed by eqsel()/neqsel(), so I
>> limited this optimization to those cases.
> How did you do that? I cannot find the code that checks for that.
In scalararraysel(), before attempting the hash-based path, we determine
whether the operator behaves like equality or inequality based on its
selectivity function:
if (oprsel == F_EQSEL || oprsel == F_EQJOINSEL)
isEquality = true;
else if (oprsel == F_NEQSEL || oprsel == F_NEQJOINSEL)
isInequality = true;
Then the hash-based MCV matching is only attempted under:
if ((isEquality || isInequality) && !is_join_clause)
So effectively this restricts the optimization to operators whose
selectivity is computed by eqsel()/neqsel() on restriction clauses. Join
clauses (which would use eqjoinsel/neqjoinsel) are excluded via
!is_join_clause
> For the MCVs, can't we reuse some code from the eqjoinsel() optimization
> we did? The entry and context structs look similar enough to only need one.
I considered reusing pieces from the eqjoinsel() , but in practice it
turned out to be difficult to share code cleanly. Also, when looking at
this file more broadly, we already have multiple places that reimplement
similar pattern.
> Making the code more compact would ease reviewing a lot.
Agreed — I also think making the code more compact would significantly
ease reviewing. I’ve found a way to unify the Const-array and ArrayExpr
cases: in the ArrayExpr path, we can first construct the same arrays as
in the Const-array case (elem_values, elem_nulls), and additionally
build a boolean array elem_const[] indicating whether each element is a
Const. Then the hash-based MCV matching function can:
- Ignore NULL and non-Const elements when building and probing the hash
table.
- Count how many non-Const elements are present.
- After MCV and non-MCV constant handling, account for non-Const
elements separately using var_eq_non_const() and fold their
probabilities into the same ANY/ALL accumulation logic.
I've attached v3 patch with it.
To validate the same estimation results, I temporarily kept both
implementations (hash-based and nested-loop) and compared their
resulting selectivity values. Whenever they differed, I logged it. I ran
regression tests and some local workload testing with this check
enabled, and did not observe any mismatches. I attached patch with this
logging.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] hash_mcv_in_logging.patch (1.5K, 3-hash_mcv_in_logging.patch)
download | inline diff:
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f6d99773b70..f4c7fb605ce 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1966,7 +1966,7 @@ scalararraysel(PlannerInfo *root,
TypeCacheEntry *typentry;
RegProcedure oprsel;
FmgrInfo oprselproc;
- Selectivity s1;
+ Selectivity s1, hash_selec = -1.0;
Selectivity s1disjoint;
/* First, deconstruct the expression */
@@ -2106,7 +2106,7 @@ scalararraysel(PlannerInfo *root,
ReleaseVariableStats(vardata);
if (s1 >= 0.0)
- return s1;
+ hash_selec = s1;
}
}
@@ -2173,6 +2173,9 @@ scalararraysel(PlannerInfo *root,
if ((useOr ? isEquality : isInequality) &&
s1disjoint >= 0.0 && s1disjoint <= 1.0)
s1 = s1disjoint;
+
+ if (hash_selec >= 0 && fabs(s1 - hash_selec) > 0.000000000001)
+ elog(LOG, "nested_loop selectivity = %f, hash_loop selectivity = %f", s1, hash_selec);
}
else if (rightop && IsA(rightop, ArrayExpr) &&
!((ArrayExpr *) rightop)->multidims)
@@ -2251,7 +2254,7 @@ scalararraysel(PlannerInfo *root,
ReleaseVariableStats(vardata);
if (s1 >= 0.0)
- return s1;
+ hash_selec = s1;
}
}
@@ -2310,6 +2313,9 @@ scalararraysel(PlannerInfo *root,
if ((useOr ? isEquality : isInequality) &&
s1disjoint >= 0.0 && s1disjoint <= 1.0)
s1 = s1disjoint;
+
+ if (hash_selec >= 0 && fabs(s1 - hash_selec) > 0.000000000001)
+ elog(LOG, "nested_loop selectivity = %f, hash_loop selectivity = %f", s1, hash_selec);
}
else
{
[text/x-patch] v3-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch (21.4K, 4-v3-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch)
download | inline diff:
From d439674b3752dd7862bcbb132db2dc91996b1a3c Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Tue, 27 Jan 2026 18:24:50 +0300
Subject: [PATCH v3] Use hash-based MCV matching for ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using nested loops. For large IN-lists and/or large
MCV lists this leads to O(N*M) planning-time behavior.
This patch adds a hash-based matching strategy, similar to the one used
in join selectivity estimation. When MCV statistics are available and the
operator supports hashing, the smaller of the two inputs (MCV list or
IN-list constant elements) is chosen as the hash table build side, and
the other side is scanned once, reducing complexity to O(N+M).
The hash-based path is restricted to equality and inequality operators
that use eqsel()/neqsel(), and is applied only when suitable hash
functions and MCV statistics are available.
---
src/backend/utils/adt/selfuncs.c | 588 ++++++++++++++++++++++++++++++-
src/tools/pgindent/typedefs.list | 3 +
2 files changed, 590 insertions(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 29fec655593..f6d99773b70 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,7 +146,7 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
@@ -156,6 +156,12 @@
#define EQJOINSEL_MCV_HASH_THRESHOLD 20
#endif
+#ifndef USE_ASSERT_CHECKING
+#define SCALARARRAY_MCV_HASH_THRESHOLD 200
+#else
+#define SCALARARRAY_MCV_HASH_THRESHOLD 20
+#endif
+
/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
typedef struct MCVHashEntry
{
@@ -176,14 +182,46 @@ typedef struct MCVHashContext
int16 hash_typlen; /* typlen of hashed data type */
} MCVHashContext;
+/* Entries in the simplehash hash table used by scalararray_mcv_hash_match */
+typedef struct MCVInHashEntry
+{
+ Datum value; /* the value represented by this entry */
+ int index; /* its index in the relevant AttStatsSlot */
+ uint32 hash; /* hash code for the Datum */
+ char status; /* status code used by simplehash.h */
+ int count; /* number of occurrences od current value in
+ * the IN-list */
+} MCVInHashEntry;
+
+/* private_data for the simplehash hash table */
+typedef struct MCVInHashContext
+{
+ FunctionCallInfo equal_fcinfo; /* the equality join operator */
+ FunctionCallInfo hash_fcinfo; /* the hash function to use */
+ bool insert_mode; /* doing inserts or lookups? */
+ bool hash_typbyval; /* typbyval of hashed data type */
+ int16 hash_typlen; /* typlen of hashed data type */
+} MCVInHashContext;
+
/* forward reference */
typedef struct MCVHashTable_hash MCVHashTable_hash;
+typedef struct MCVInHashTable_hash MCVInHashTable_hash;
/* Hooks for plugins to get control when we ask for stats */
get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left, Datum *elem_values,
+ bool *elem_nulls, int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality);
+static void accum_scalararray_prob(Selectivity s1, bool useOr, bool isEquality,
+ bool isInequality, double nullfrac,
+ double *selec, double *s1disjoint);
+static uint32 hash_mcv_in(MCVInHashTable_hash *tab, Datum key);
+static bool mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -287,6 +325,19 @@ static double btcost_correlation(IndexOptInfo *index,
#define SH_DECLARE
#include "lib/simplehash.h"
+#define SH_PREFIX MCVInHashTable
+#define SH_ELEMENT_TYPE MCVInHashEntry
+#define SH_KEY_TYPE Datum
+#define SH_KEY value
+#define SH_HASH_KEY(tab,key) hash_mcv_in(tab, key)
+#define SH_EQUAL(tab,key0,key1) mcvs_in_equal(tab, key0, key1)
+#define SH_SCOPE static inline
+#define SH_STORE_HASH
+#define SH_GET_HASH(tab,ent) (ent)->hash
+#define SH_DEFINE
+#define SH_DECLARE
+#include "lib/simplehash.h"
+
/*
* eqsel - Selectivity of "=" for any data types.
@@ -2025,6 +2076,40 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then punt and return a default estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ bool *elem_const = (bool *) palloc(sizeof(bool) * num_elems);
+
+ /* all array elements are Const nodes */
+ memset(elem_const, true, sizeof(bool) * num_elems);
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator, clause->inputcollid, other_op, var_on_left,
+ elem_values, elem_nulls, num_elems, elem_const,
+ nominal_element_type, useOr, isEquality, isInequality);
+ pfree(elem_const);
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2100,6 +2185,76 @@ scalararraysel(PlannerInfo *root,
get_typlenbyval(arrayexpr->element_typeid,
&elmlen, &elmbyval);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then punt and return a default estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ int num_elems;
+ Datum *elem_values;
+ bool *elem_nulls;
+ bool *elem_const;
+ ListCell *lc;
+ int i = 0;
+
+ num_elems = list_length(arrayexpr->elements);
+ elem_values = (Datum *) palloc0(sizeof(Datum) * num_elems);
+ elem_nulls = (bool *) palloc0(sizeof(bool) * num_elems);
+ elem_const = (bool *) palloc0(sizeof(bool) * num_elems);
+
+ /*
+ * Build arrays describing ARRAY[] elements: - elem_values:
+ * Datum value for Const elements - elem_nulls: whether
+ * element is NULL - elem_const: whether element is a Const
+ * node
+ */
+ foreach(lc, arrayexpr->elements)
+ {
+ Node *elem_value = (Node *) lfirst(lc);
+
+ if (IsA(elem_value, Const))
+ {
+ elem_values[i] = ((Const *) elem_value)->constvalue;
+ elem_nulls[i] = ((Const *) elem_value)->constisnull;
+ elem_const[i] = true;
+ }
+ else
+ {
+ elem_nulls[i] = false;
+ elem_const[i] = false;
+ }
+
+ i++;
+ }
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator, clause->inputcollid, other_op, var_on_left,
+ elem_values, elem_nulls, num_elems, elem_const,
+ nominal_element_type, useOr, isEquality, isInequality);
+
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* We use the assumption of disjoint probabilities here too, although
* the odds of equal array elements are rather higher if the elements
@@ -2210,6 +2365,437 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr (ANY/ALL) using MCV statistics
+ * with hash-based matching.
+ *
+ * This function follows the same probability model as the generic
+ * ScalarArrayOpExpr selectivity code (independent or disjoint probabilities
+ * for OR/AND combinations), but attempts to speed up matching between
+ * IN-list elements and the column's most-common-values (MCV) statistics by
+ * using hashing instead of nested loops.
+ *
+ * MCV statistics are used only to obtain per-value selectivities for
+ * constants that match MCV entries. All probabilities are combined using
+ * the standard ANY/ALL formulas, exactly as in the generic estimator.
+ *
+ * The function may return -1.0 to indicate that hash-based MCV estimation
+ * is not applicable (for example, missing statistics, unsupported operator,
+ * or unavailable hash functions), in which case the caller should fall back
+ * to the generic ScalarArrayOpExpr selectivity estimation.
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * other_op: expression for the non-variable side of the comparison
+ * var_on_left: true if the variable is on the left side of the operator
+ * elem_values: array of IN-list element values
+ * elem_nulls: array indicating which IN-list elements are NULL
+ * elem_const: array indicating which IN-list elements are Const nodes
+ * num_elems: number of IN-list elements
+ * nominal_element_type: type of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator behaves like equality
+ * isInequality: true if the operator behaves like inequality
+ *
+ * Result:
+ * Selectivity estimate in the range [0.0, 1.0], or -1.0 if no estimate
+ * could be produced by this function.
+ *
+ * Note:
+ * This function assumes that the operator’s selectivity behavior matches
+ * eqsel()/neqsel semantics. It must not be used for operators with custom
+ * or non-standard selectivity behavior.
+ */
+static double
+scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left,
+ Datum *elem_values, bool *elem_nulls, int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+
+ /*
+ * f the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= SCALARARRAY_MCV_HASH_THRESHOLD)
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVInHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVInHashContext hashContext;
+ double sumallcommon = 0.0,
+ nonmcv_selec = 0.0;
+ bool isdefault;
+ bool hash_mcv;
+ double otherdistinct;
+ Datum *arrayHash;
+ Datum *arrayProbe;
+ int nvaluesHash;
+ int nvaluesProbe;
+ int nvaluesnonmcv = num_elems;
+ int nvaluesnonconst = 0;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ sumallcommon += sslot.numbers[i];
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This is
+ * the part of the column distribution not covered by MCVs.
+ */
+ nonmcv_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(nonmcv_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ nonmcv_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 && nonmcv_selec > sslot.numbers[sslot.nnumbers - 1])
+ nonmcv_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Make sure we build the hash table on the smaller array. */
+ if (sslot.nvalues <= num_elems)
+ {
+ hash_mcv = true;
+ nvaluesHash = sslot.nvalues;
+ nvaluesProbe = num_elems;
+ arrayHash = sslot.values;
+ arrayProbe = elem_values;
+ }
+ else
+ {
+ hash_mcv = false;
+ nvaluesHash = num_elems;
+ nvaluesProbe = sslot.nvalues;
+ arrayHash = elem_values;
+ arrayProbe = sslot.values;
+ }
+
+ fmgr_info(hash_mcv ? hashLeft : hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(hash_mcv ? sslot.valuetype : nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVInHashTable_create(CurrentMemoryContext,
+ nvaluesHash,
+ &hashContext);
+
+ /* Build a hash table over the smaller input side. */
+ for (int i = 0; i < nvaluesHash; i++)
+ {
+ bool found = false;
+ MCVInHashEntry *entry;
+
+ /*
+ * When hashing IN-list values (hash_mcv == false), we only insert
+ * constant, non-NULL elements. NULL and non-Const elements are
+ * counted separately, because they cannot participate in MCV
+ * matching and must be handled later using generic selectivity
+ * estimation.
+ */
+ if (!hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ nvaluesnonmcv--;
+ continue;
+ }
+
+ if (!elem_const[i])
+ {
+ nvaluesnonmcv--;
+ nvaluesnonconst++;
+ continue;
+ }
+ }
+
+ entry = MCVInHashTable_insert(hashTable, arrayHash[i], &found);
+
+ /*
+ * entry->count tracks how many times the same value appears, so
+ * that duplicate IN-list elements can be folded into the
+ * probability calculation.
+ */
+ if (likely(!found))
+ {
+ entry->index = i;
+ entry->count = 1;
+ }
+ else
+ entry->count++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hash_mcv ? hashRight : hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ for (int i = 0; i < nvaluesProbe; i++)
+ {
+ MCVInHashEntry *entry;
+ Selectivity s1;
+ int nvaluesmcv;
+
+ /*
+ * When probing with IN-list elements, ignore NULLs and non-Const
+ * expressions: they cannot be matched against MCVs and will be
+ * accounted for later by generic estimation.
+ */
+ if (hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ nvaluesnonmcv--;
+ continue;
+ }
+
+ if (!elem_const[i])
+ {
+ nvaluesnonmcv--;
+ nvaluesnonconst++;
+ continue;
+ }
+ }
+
+ entry = MCVInHashTable_lookup(hashTable, arrayProbe[i]);
+
+ /*
+ * If found, obtain its MCV frequency and remember how many values
+ * on the hashed side map to this entry.
+ */
+ if (entry != NULL)
+ {
+ s1 = hash_mcv ? sslot.numbers[entry->index]
+ : sslot.numbers[i];
+
+ nvaluesmcv = entry->count;
+
+ /* Matched values are no longer considered non-MCV */
+ nvaluesnonmcv -= nvaluesmcv;
+ }
+ else
+ {
+ /* No MCV match for this value */
+ continue;
+ }
+
+ /*
+ * Fold this value's probability into the running ANY/ALL
+ * selectivity estimate once for each occurrence.
+ */
+ for (int j = 0; j < nvaluesmcv; j++)
+ accum_scalararray_prob(s1, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+ }
+
+ /*
+ * Account for constant IN-list values that did not match any MCV.
+ *
+ * Each such value is assumed to have probability = nonmcv_selec,
+ * derived from the remaining (non-MCV) probability mass.
+ */
+ for (int i = 0; i < nvaluesnonmcv; i++)
+ accum_scalararray_prob(nonmcv_selec, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+
+ /*
+ * Account for non-Const IN-list elements.
+ *
+ * These values cannot be matched against MCVs, so we rely on the
+ * operator's generic selectivity estimator for each of them.
+ */
+ for (int i = 0; i < nvaluesnonconst; i++)
+ {
+ Selectivity s1 = var_eq_non_const(vardata, operator, collation,
+ other_op, var_on_left, isInequality);
+
+ accum_scalararray_prob(s1, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+ }
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed distinct,
+ * the events are disjoint and the total probability is the sum of
+ * individual probabilities. Use that estimate if it lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+
+ CLAMP_PROBABILITY(selec);
+
+ MCVInHashTable_destroy(hashTable);
+ free_attstatsslot(&sslot);
+ }
+
+ return selec;
+}
+
+/*
+ * Accumulate the selectivity contribution of a single array element
+ * into the running ScalarArrayOpExpr selectivity estimate.
+ */
+static void
+accum_scalararray_prob(Selectivity s1,
+ bool useOr,
+ bool isEquality,
+ bool isInequality,
+ double nullfrac,
+ double *selec,
+ double *s1disjoint)
+{
+ Selectivity s2 = s1;
+
+ if (isInequality)
+ s2 = 1.0 - s2 - nullfrac;
+
+ CLAMP_PROBABILITY(s2);
+
+ if (useOr)
+ {
+ *selec = *selec + s2 - (*selec) * s2;
+ if (isEquality)
+ *s1disjoint += s2;
+ }
+ else
+ {
+ *selec = (*selec) * s2;
+ if (isInequality)
+ *s1disjoint += s2 - 1.0;
+ }
+}
+
+/*
+ * Support functions for the hash tables used by eqjoinsel_find_matches
+ */
+static uint32
+hash_mcv_in(MCVInHashTable_hash *tab, Datum key)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+ FunctionCallInfo fcinfo = context->hash_fcinfo;
+ Datum fresult;
+
+ fcinfo->args[0].value = key;
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ Assert(!fcinfo->isnull);
+ return DatumGetUInt32(fresult);
+}
+
+static bool
+mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+
+ if (context->insert_mode)
+ {
+ /*
+ * During the insertion step, any comparisons will be between two
+ * Datums of the hash table's data type, so if the given operator is
+ * cross-type it will be the wrong thing to use. Fortunately, we can
+ * use datum_image_eq instead. The MCV values should all be distinct
+ * anyway, so it's mostly pro-forma to compare them at all.
+ */
+ return datum_image_eq(key0, key1,
+ context->hash_typbyval, context->hash_typlen);
+ }
+ else
+ {
+ FunctionCallInfo fcinfo = context->equal_fcinfo;
+ Datum fresult;
+
+ fcinfo->args[0].value = key0;
+ fcinfo->args[1].value = key1;
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ return (!fcinfo->isnull && DatumGetBool(fresult));
+ }
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index ddbe4c64971..48b886c2ae4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1671,6 +1671,9 @@ MBuf
MCVHashContext
MCVHashEntry
MCVHashTable_hash
+MCVInHashContext
+MCVInHashEntry
+MCVInHashTable_hash
MCVItem
MCVList
MEMORY_BASIC_INFORMATION
--
2.34.1
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-01-28 13:08 Chengpeng Yan <[email protected]>
parent: Ilia Evdokimov <[email protected]>
1 sibling, 1 reply; 10+ messages in thread
From: Chengpeng Yan @ 2026-01-28 13:08 UTC (permalink / raw)
To: Ilia Evdokimov <[email protected]>; +Cc: David Geier <[email protected]>; [email protected] <[email protected]>
> On Jan 14, 2026, at 18:19, Ilia Evdokimov <[email protected]> wrote:
> I’ve attached v2 of the patch. It currently uses two fairly large helper functions for the Const and ArrayExpr cases; this is intentional to keep the logic explicit and reviewable, even though these will likely need refactoring or consolidation later.
Thanks for working on this.
I had previously reviewed the v2 patch and wrote up some comments, but
didn’t get a chance to send them before v3 was posted. I haven’t yet had
time to review v3 in detail, so I’m not sure whether the issues below
have already been addressed there. I’m posting my earlier review notes
first and will follow up with comments on v3 once I’ve had a chance to
look at it.
* Treat NULL array elements as zero selectivity for ALL:
In `scalararray_mcv_hash_match_const()` (and similarly
`scalararray_mcv_hash_match_expr()`), NULL array elements are currently
handled by simply continuing the loop (e.g. `if (elem_nulls[i])
continue;`), effectively ignoring them.
This behavior is only correct for ANY/OR semantics. For ALL/AND (`useOr
= false`), a single NULL array element causes the `ScalarArrayOpExpr` to
never return TRUE for strict operators (as assumed by the surrounding
code and comments). In that case, the correct selectivity estimate
should be 0.0, but the current code path can return a non-zero
selectivity.
* Fix cross-type equality argument order in `mcvs_in_equal`:
`mcvs_in_equal()` always invokes the equality function as `(key0,
key1)`. However, `simplehash` provides `key0` from the hash table and
`key1` as the probe key.
In the branch where the hash table is built over IN-list values and
probed with MCVs (the `sslot.nvalues > num_elems` path), this reverses
the operator’s argument order for cross-type equality operators. This
risks incorrect match decisions and may misinterpret Datums compared to
the operator’s declared signature.
* Include non-MCV IN-list constants in non-disjoint selectivity:
In the `sslot.nvalues > num_elems` path of
`scalararray_mcv_hash_match_const()` and
`scalararray_mcv_hash_match_expr()`, non-MCV constant elements currently
only contribute via `disjoint_sel`.
For cases where disjoint-probability estimation is not used (e.g. ALL,
`<> ANY`, or when `disjoint_sel` is out of range), the code leaves the
selectivity based solely on MCV matches. This effectively treats non-MCV
constants as having probability 1.0, leading to overestimation of
selectivity.
* Avoid double-negating inequality estimates for non-Const elements:
In the `scalararray_mcv_hash_match_expr()` `sslot.nvalues > num_elems`
branch, non-Const elements are handled via
`var_eq_non_const(..., negate = isInequality)`
and then later adjusted again with
`if (isInequality)
s1 = 1.0 - s1 - nullfrac;`
This results in a double negation for inequality cases, effectively
turning the estimate back into an equality selectivity.
--
Best regards,
Chengpeng Yan
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-01-29 11:14 Ilia Evdokimov <[email protected]>
parent: Chengpeng Yan <[email protected]>
0 siblings, 1 reply; 10+ messages in thread
From: Ilia Evdokimov @ 2026-01-29 11:14 UTC (permalink / raw)
To: Chengpeng Yan <[email protected]>; +Cc: David Geier <[email protected]>; [email protected] <[email protected]>
Hi Chengpeng,
Thanks for your review!
On 28.01.2026 16:08, Chengpeng Yan wrote:
> * Treat NULL array elements as zero selectivity for ALL:
Agreed. For ALL/AND semantics the function now returns selectivity = 0.0
as soon as a NULL element is encountered.
> * Fix cross-type equality argument order in `mcvs_in_equal`:
Agreed. Added 'op_is_reserved' flag MCVInHashContext, same as in
MCVHashContext.
> * Include non-MCV IN-list constants in non-disjoint selectivity:
This is not applicable to v3.
> * Avoid double-negating inequality estimates for non-Const elements:
Agreed. var_eq_non_const() is now always with negate = false, not to
call negation twice.
Attached v4 patch with above fixes.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] v4-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch (22.2K, 3-v4-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch)
download | inline diff:
From c0f43088100b2bb80d3cc35c7a0bcd663a079bf0 Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Thu, 29 Jan 2026 14:12:23 +0300
Subject: [PATCH v4] Use hash-based MCV matching for ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using nested loops. For large IN-lists and/or large
MCV lists this leads to O(N*M) planning-time behavior.
This patch adds a hash-based matching strategy, similar to the one used
in join selectivity estimation. When MCV statistics are available and the
operator supports hashing, the smaller of the two inputs (MCV list or
IN-list constant elements) is chosen as the hash table build side, and
the other side is scanned once, reducing complexity to O(N+M).
The hash-based path is restricted to equality and inequality operators
that use eqsel()/neqsel(), and is applied only when suitable hash
functions and MCV statistics are available.
---
src/backend/utils/adt/selfuncs.c | 624 ++++++++++++++++++++++++++++++-
src/tools/pgindent/typedefs.list | 3 +
2 files changed, 626 insertions(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 29fec655593..09ea733aec1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,7 +146,7 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
@@ -156,6 +156,12 @@
#define EQJOINSEL_MCV_HASH_THRESHOLD 20
#endif
+#ifndef USE_ASSERT_CHECKING
+#define SCALARARRAY_MCV_HASH_THRESHOLD 200
+#else
+#define SCALARARRAY_MCV_HASH_THRESHOLD 20
+#endif
+
/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
typedef struct MCVHashEntry
{
@@ -176,14 +182,47 @@ typedef struct MCVHashContext
int16 hash_typlen; /* typlen of hashed data type */
} MCVHashContext;
+/* Entries in the simplehash hash table used by scalararray_mcv_hash_match */
+typedef struct MCVInHashEntry
+{
+ Datum value; /* the value represented by this entry */
+ int index; /* its index in the relevant AttStatsSlot */
+ uint32 hash; /* hash code for the Datum */
+ char status; /* status code used by simplehash.h */
+ int count; /* number of occurrences od current value in
+ * the IN-list */
+} MCVInHashEntry;
+
+/* private_data for the simplehash hash table */
+typedef struct MCVInHashContext
+{
+ FunctionCallInfo equal_fcinfo; /* the equality join operator */
+ FunctionCallInfo hash_fcinfo; /* the hash function to use */
+ bool op_is_reversed; /* equality compares hash type to probe type */
+ bool insert_mode; /* doing inserts or lookups? */
+ bool hash_typbyval; /* typbyval of hashed data type */
+ int16 hash_typlen; /* typlen of hashed data type */
+} MCVInHashContext;
+
/* forward reference */
typedef struct MCVHashTable_hash MCVHashTable_hash;
+typedef struct MCVInHashTable_hash MCVInHashTable_hash;
/* Hooks for plugins to get control when we ask for stats */
get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left, Datum *elem_values,
+ bool *elem_nulls, int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality);
+static void accum_scalararray_prob(Selectivity s1, bool useOr, bool isEquality,
+ bool isInequality, double nullfrac,
+ double *selec, double *s1disjoint);
+static uint32 hash_mcv_in(MCVInHashTable_hash *tab, Datum key);
+static bool mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -287,6 +326,19 @@ static double btcost_correlation(IndexOptInfo *index,
#define SH_DECLARE
#include "lib/simplehash.h"
+#define SH_PREFIX MCVInHashTable
+#define SH_ELEMENT_TYPE MCVInHashEntry
+#define SH_KEY_TYPE Datum
+#define SH_KEY value
+#define SH_HASH_KEY(tab,key) hash_mcv_in(tab, key)
+#define SH_EQUAL(tab,key0,key1) mcvs_in_equal(tab, key0, key1)
+#define SH_SCOPE static inline
+#define SH_STORE_HASH
+#define SH_GET_HASH(tab,ent) (ent)->hash
+#define SH_DEFINE
+#define SH_DECLARE
+#include "lib/simplehash.h"
+
/*
* eqsel - Selectivity of "=" for any data types.
@@ -2025,6 +2077,40 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then punt and return a default estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ bool *elem_const = (bool *) palloc(sizeof(bool) * num_elems);
+
+ /* all array elements are Const nodes */
+ memset(elem_const, true, sizeof(bool) * num_elems);
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator, clause->inputcollid, other_op, var_on_left,
+ elem_values, elem_nulls, num_elems, elem_const,
+ nominal_element_type, useOr, isEquality, isInequality);
+ pfree(elem_const);
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2100,6 +2186,76 @@ scalararraysel(PlannerInfo *root,
get_typlenbyval(arrayexpr->element_typeid,
&elmlen, &elmbyval);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then punt and return a default estimate.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ int num_elems;
+ Datum *elem_values;
+ bool *elem_nulls;
+ bool *elem_const;
+ ListCell *lc;
+ int i = 0;
+
+ num_elems = list_length(arrayexpr->elements);
+ elem_values = (Datum *) palloc0(sizeof(Datum) * num_elems);
+ elem_nulls = (bool *) palloc0(sizeof(bool) * num_elems);
+ elem_const = (bool *) palloc0(sizeof(bool) * num_elems);
+
+ /*
+ * Build arrays describing ARRAY[] elements: - elem_values:
+ * Datum value for Const elements - elem_nulls: whether
+ * element is NULL - elem_const: whether element is a Const
+ * node
+ */
+ foreach(lc, arrayexpr->elements)
+ {
+ Node *elem_value = (Node *) lfirst(lc);
+
+ if (IsA(elem_value, Const))
+ {
+ elem_values[i] = ((Const *) elem_value)->constvalue;
+ elem_nulls[i] = ((Const *) elem_value)->constisnull;
+ elem_const[i] = true;
+ }
+ else
+ {
+ elem_nulls[i] = false;
+ elem_const[i] = false;
+ }
+
+ i++;
+ }
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator, clause->inputcollid, other_op, var_on_left,
+ elem_values, elem_nulls, num_elems, elem_const,
+ nominal_element_type, useOr, isEquality, isInequality);
+
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* We use the assumption of disjoint probabilities here too, although
* the odds of equal array elements are rather higher if the elements
@@ -2210,6 +2366,472 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr (ANY/ALL) using MCV statistics
+ * with hash-based matching.
+ *
+ * This function follows the same probability model as the generic
+ * ScalarArrayOpExpr selectivity code (independent or disjoint probabilities
+ * for OR/AND combinations), but attempts to speed up matching between
+ * IN-list elements and the column's most-common-values (MCV) statistics by
+ * using hashing instead of nested loops.
+ *
+ * MCV statistics are used only to obtain per-value selectivities for
+ * constants that match MCV entries. All probabilities are combined using
+ * the standard ANY/ALL formulas, exactly as in the generic estimator.
+ *
+ * The function may return -1.0 to indicate that hash-based MCV estimation
+ * is not applicable (for example, missing statistics, unsupported operator,
+ * or unavailable hash functions), in which case the caller should fall back
+ * to the generic ScalarArrayOpExpr selectivity estimation.
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * other_op: expression for the non-variable side of the comparison
+ * var_on_left: true if the variable is on the left side of the operator
+ * elem_values: array of IN-list element values
+ * elem_nulls: array indicating which IN-list elements are NULL
+ * elem_const: array indicating which IN-list elements are Const nodes
+ * num_elems: number of IN-list elements
+ * nominal_element_type: type of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator behaves like equality
+ * isInequality: true if the operator behaves like inequality
+ *
+ * Result:
+ * Selectivity estimate in the range [0.0, 1.0], or -1.0 if no estimate
+ * could be produced by this function.
+ *
+ * Note:
+ * This function assumes that the operator’s selectivity behavior matches
+ * eqsel()/neqsel semantics. It must not be used for operators with custom
+ * or non-standard selectivity behavior.
+ */
+static double
+scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left,
+ Datum *elem_values, bool *elem_nulls, int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+
+ /*
+ * If the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= SCALARARRAY_MCV_HASH_THRESHOLD)
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVInHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVInHashContext hashContext;
+ double sumallcommon = 0.0,
+ nonmcv_selec = 0.0;
+ bool isdefault;
+ bool hash_mcv;
+ double otherdistinct;
+ Datum *arrayHash;
+ Datum *arrayProbe;
+ int nvaluesHash;
+ int nvaluesProbe;
+ int nvalues_non_mcv = num_elems;
+ int nvalues_nonconst = 0;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ sumallcommon += sslot.numbers[i];
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This is
+ * the part of the column distribution not covered by MCVs.
+ */
+ nonmcv_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(nonmcv_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ nonmcv_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 && nonmcv_selec > sslot.numbers[sslot.nnumbers - 1])
+ nonmcv_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Make sure we build the hash table on the smaller array. */
+ if (sslot.nvalues <= num_elems)
+ {
+ hash_mcv = true;
+ nvaluesHash = sslot.nvalues;
+ nvaluesProbe = num_elems;
+ arrayHash = sslot.values;
+ arrayProbe = elem_values;
+ }
+ else
+ {
+ hash_mcv = false;
+ nvaluesHash = num_elems;
+ nvaluesProbe = sslot.nvalues;
+ arrayHash = elem_values;
+ arrayProbe = sslot.values;
+ }
+
+ fmgr_info(hash_mcv ? hashLeft : hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.op_is_reversed = !hash_mcv;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(hash_mcv ? sslot.valuetype : nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVInHashTable_create(CurrentMemoryContext,
+ nvaluesHash,
+ &hashContext);
+
+ /* Build a hash table over the smaller input side. */
+ for (int i = 0; i < nvaluesHash; i++)
+ {
+ bool found = false;
+ MCVInHashEntry *entry;
+
+ /*
+ * When hashing IN-list values (hash_mcv == false), we only insert
+ * constant, non-NULL elements. NULL and non-Const elements are
+ * counted separately, because they cannot participate in MCV
+ * matching and must be handled later using generic selectivity
+ * estimation.
+ */
+ if (!hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ /*
+ * For ALL/AND semantics, any NULL element makes result
+ * always FALSE
+ */
+ if (!useOr)
+ {
+ MCVInHashTable_destroy(hashTable);
+ free_attstatsslot(&sslot);
+
+ return 0.0;
+ }
+
+ nvalues_non_mcv--;
+ continue;
+ }
+
+ if (!elem_const[i])
+ {
+ nvalues_non_mcv--;
+ nvalues_nonconst++;
+ continue;
+ }
+ }
+
+ entry = MCVInHashTable_insert(hashTable, arrayHash[i], &found);
+
+ /*
+ * entry->count tracks how many times the same value appears, so
+ * that duplicate IN-list elements can be folded into the
+ * probability calculation.
+ */
+ if (likely(!found))
+ {
+ entry->index = i;
+ entry->count = 1;
+ }
+ else
+ entry->count++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hash_mcv ? hashRight : hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ for (int i = 0; i < nvaluesProbe; i++)
+ {
+ MCVInHashEntry *entry;
+ Selectivity s1;
+ int nvaluesmcv;
+
+ /*
+ * When probing with IN-list elements, ignore NULLs and non-Const
+ * expressions: they cannot be matched against MCVs and will be
+ * accounted for later by generic estimation.
+ */
+ if (hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ /*
+ * For ALL/AND semantics, any NULL element makes result
+ * always FALSE
+ */
+ if (!useOr)
+ {
+ MCVInHashTable_destroy(hashTable);
+ free_attstatsslot(&sslot);
+
+ return 0.0;
+ }
+
+ nvalues_non_mcv--;
+ continue;
+ }
+
+ if (!elem_const[i])
+ {
+ nvalues_non_mcv--;
+ nvalues_nonconst++;
+ continue;
+ }
+ }
+
+ entry = MCVInHashTable_lookup(hashTable, arrayProbe[i]);
+
+ /*
+ * If found, obtain its MCV frequency and remember how many values
+ * on the hashed side map to this entry.
+ */
+ if (entry != NULL)
+ {
+ s1 = hash_mcv ? sslot.numbers[entry->index]
+ : sslot.numbers[i];
+
+ nvaluesmcv = entry->count;
+
+ /* Matched values are no longer considered non-MCV */
+ nvalues_non_mcv -= nvaluesmcv;
+ }
+ else
+ {
+ /* No MCV match for this value */
+ continue;
+ }
+
+ /*
+ * Fold this value's probability into the running ANY/ALL
+ * selectivity estimate once for each occurrence.
+ */
+ for (int j = 0; j < nvaluesmcv; j++)
+ accum_scalararray_prob(s1, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+ }
+
+ /*
+ * Account for constant IN-list values that did not match any MCV.
+ *
+ * Each such value is assumed to have probability = nonmcv_selec,
+ * derived from the remaining (non-MCV) probability mass.
+ */
+ for (int i = 0; i < nvalues_non_mcv; i++)
+ accum_scalararray_prob(nonmcv_selec, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+
+ /*
+ * Account for non-Const IN-list elements.
+ *
+ * These values cannot be matched against MCVs, so we rely on the
+ * operator's generic selectivity estimator for each of them.
+ */
+ for (int i = 0; i < nvalues_nonconst; i++)
+ {
+ Selectivity s1 = var_eq_non_const(vardata, operator, collation,
+ other_op, var_on_left, false);
+
+ accum_scalararray_prob(s1, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+ }
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed distinct,
+ * the events are disjoint and the total probability is the sum of
+ * individual probabilities. Use that estimate if it lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+
+ CLAMP_PROBABILITY(selec);
+
+ MCVInHashTable_destroy(hashTable);
+ free_attstatsslot(&sslot);
+ }
+
+ return selec;
+}
+
+/*
+ * Accumulate the selectivity contribution of a single array element
+ * into the running ScalarArrayOpExpr selectivity estimate.
+ */
+static void
+accum_scalararray_prob(Selectivity s1, bool useOr, bool isEquality,
+ bool isInequality, double nullfrac,
+ double *selec, double *s1disjoint)
+{
+ Selectivity s2 = s1;
+
+ if (isInequality)
+ s2 = 1.0 - s2 - nullfrac;
+
+ CLAMP_PROBABILITY(s2);
+
+ if (useOr)
+ {
+ *selec = *selec + s2 - (*selec) * s2;
+ if (isEquality)
+ *s1disjoint += s2;
+ }
+ else
+ {
+ *selec = (*selec) * s2;
+ if (isInequality)
+ *s1disjoint += s2 - 1.0;
+ }
+}
+
+/*
+ * Support functions for the hash tables used by eqjoinsel_find_matches
+ */
+static uint32
+hash_mcv_in(MCVInHashTable_hash *tab, Datum key)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+ FunctionCallInfo fcinfo = context->hash_fcinfo;
+ Datum fresult;
+
+ fcinfo->args[0].value = key;
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ Assert(!fcinfo->isnull);
+ return DatumGetUInt32(fresult);
+}
+
+static bool
+mcvs_in_equal(MCVInHashTable_hash *tab, Datum key0, Datum key1)
+{
+ MCVInHashContext *context = (MCVInHashContext *) tab->private_data;
+
+ if (context->insert_mode)
+ {
+ /*
+ * During the insertion step, any comparisons will be between two
+ * Datums of the hash table's data type, so if the given operator is
+ * cross-type it will be the wrong thing to use. Fortunately, we can
+ * use datum_image_eq instead. The MCV values should all be distinct
+ * anyway, so it's mostly pro-forma to compare them at all.
+ */
+ return datum_image_eq(key0, key1,
+ context->hash_typbyval, context->hash_typlen);
+ }
+ else
+ {
+ FunctionCallInfo fcinfo = context->equal_fcinfo;
+ Datum fresult;
+
+ /*
+ * Apply the operator the correct way around. Although simplehash.h
+ * doesn't document this explicitly, during lookups key0 is from the
+ * hash table while key1 is the probe value, so we should compare them
+ * in that order only if op_is_reversed.
+ */
+ if (context->op_is_reversed)
+ {
+ fcinfo->args[0].value = key0;
+ fcinfo->args[1].value = key1;
+ }
+ else
+ {
+ fcinfo->args[0].value = key1;
+ fcinfo->args[1].value = key0;
+ }
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ return (!fcinfo->isnull && DatumGetBool(fresult));
+ }
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 34374df0d67..235678b73c6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1671,6 +1671,9 @@ MBuf
MCVHashContext
MCVHashEntry
MCVHashTable_hash
+MCVInHashContext
+MCVInHashEntry
+MCVInHashTable_hash
MCVItem
MCVList
MEMORY_BASIC_INFORMATION
--
2.34.1
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-02-02 09:29 David Geier <[email protected]>
parent: Ilia Evdokimov <[email protected]>
0 siblings, 1 reply; 10+ messages in thread
From: David Geier @ 2026-02-02 09:29 UTC (permalink / raw)
To: Ilia Evdokimov <[email protected]>; Chengpeng Yan <[email protected]>; +Cc: [email protected] <[email protected]>
Hi!
> Attached v4 patch with above fixes.
Good progress!
I did another pass over the code, focusing on structure:
- MCVHasContext and MCVInHashContext are identical. MCVHashEntry and
MCVInHashEntry only differ by the count member. I would, as said before,
merge them and simply not use the count member for the join case.
- hash_mcv_in() and mcvs_in_equal() are identical to hash_mcv() and
mcvs_equal(). Let's remove the new functions and use the existing ones
instead, in the spirit of the previous point.
- The threshold constants are also identical. I would merge them into a
single, e.g. MCV_HASH_THRESHOLD, in the spirit of the previous two points.
- MCVHashTable_hash will then be interchangable with
MCVInHashTable_hash. So let's remove MCVInHashTable_hash, in the spirit
of the previous three points.
- Use palloc_array() instead of palloc() when allocating arrays.
- We can avoid allocating the all-true elem_const array by passing NULL
for elem_const to scalararray_mcv_hash_match(), and considering a NULL
pointer to mean "all elements are constant".
- The following comment got copy&pasted from eqsel_internal() twice. It
reads a little strange now because we're not punting here by immediately
returning like in eqsel_internal() but instead fallback to the original
code path. Maybe say instead "... falling back to default code path to
compute default selectivity" or something like that.
/*
* If expression is not variable = something or something =
* variable, then punt and return a default estimate.
*/
- The call to fmgr_info(opfuncoid, &eqproc) is currently under have_mcvs
but can be moved into the next if.
- elem_nulls and elem_const does have to be 0-initialized via palloc0().
All elements are set in the subsequent for-loop. I believe elem_values
also doesn't have to be 0-initialized via palloc0().
- Have you checked there there's test coverage for the special cases
(nvalues_non_mcv > 0, nvalues_nonconst > 0, IN contains NULL,
isEnequality==true, etc.)? If not let's add tests for these.
I'll do a 2nd iteration, focusing on correctness, once these comments
are addressed and I've got the SQL from you so that I can test the
corner cases manually.
--
David Geier
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-02-07 07:42 Tatsuya Kawata <[email protected]>
parent: David Geier <[email protected]>
0 siblings, 1 reply; 10+ messages in thread
From: Tatsuya Kawata @ 2026-02-07 07:42 UTC (permalink / raw)
To: David Geier <[email protected]>; +Cc: Ilia Evdokimov <[email protected]>; Chengpeng Yan <[email protected]>; [email protected] <[email protected]>
Hi,
Thank you for this patch.
I've been studying how PostgreSQL handles selectivity estimation, and this
optimization for large IN-lists looks very useful.
I ran some tests for the special cases David mentioned:
- NULL + ALL: correctly returns selectivity ≈ 0 (rows=1)
- isInequality: <> ALL estimates match NOT IN
- Cross-type: int = ANY(bigint[]) works correctly
- Duplicate values: IN (1,1,1,2,2,3) preserves existing behavior
I noticed a few minor points:
1. The comment in MCVInHashEntry struct contains a typo:
"number of occurrences od current value" -> "of"
2. The ALL + NULL early-return logic appears in two places (lines 2579-2591
and 2644-2656). I initially considered consolidating this by checking for
NULL elements before building the hash table, but realized this would add
an extra loop in the common case where there are no NULLs.
Perhaps a brief comment explaining why this check is duplicated (to
avoid the overhead of a separate NULL-scanning loop) would help future
readers understand the design choice?
3. Minor style suggestion: adding a brief SQL example in the header comment
(e.g., "WHERE x IN (1,2,3,...)" or "WHERE x = ANY(ARRAY[...])") might help
future readers quickly understand the use case.
Thanks again for working on this optimization. It's been very educational
to follow the discussion and understand how selectivity estimation works in
PostgreSQL.
Regards,
Tatsuya Kawata
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-02-18 12:48 Ilia Evdokimov <[email protected]>
parent: Tatsuya Kawata <[email protected]>
0 siblings, 2 replies; 10+ messages in thread
From: Ilia Evdokimov @ 2026-02-18 12:48 UTC (permalink / raw)
To: Tatsuya Kawata <[email protected]>; David Geier <[email protected]>; +Cc: Chengpeng Yan <[email protected]>; [email protected] <[email protected]>
I've fixed all the comments raised above and updated the v5 patch.
On 2/7/26 10:42, Tatsuya Kawata wrote:
> I initially considered consolidating this by checking for NULL
> elements before building the hash table, but realized this would add
> an extra loop in the common case where there are no NULLs.
Thanks for that suggestion. We can check for NULL elements without an
explicit loop by using memchr(), so there's no need for an additional
building of hash table. I'll update patch with it.
That said, I think it might be better to continue this small
optimization with NULL for constant arrays separately in another thread.
It's cleaner to split this work into smaller, focused changes rather
than mixing everything into single patch
If anything is still unclear in the code or insufficiently documented,
or if you have other suggestions, please do not hesitate to point them out.
--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
[text/x-patch] v5-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch (18.9K, 2-v5-0001-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch)
download | inline diff:
From 0db00c23fbadc76c2345d3d14d7972c85d407196 Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Wed, 18 Feb 2026 15:19:00 +0300
Subject: [PATCH v5] Use hash-based MCV matching for ScalarArrayOpExpr
selectivity
When estimating selectivity for ScalarArrayOpExpr (IN / ANY / ALL) with
available MCV statistics, the planner currently matches IN-list elements
against the MCV array using nested loops. For large IN-lists and/or large
MCV lists this leads to O(N*M) planning-time behavior.
This patch adds a hash-based matching strategy, similar to the one used
in join selectivity estimation. When MCV statistics are available and the
operator supports hashing, the smaller of the two inputs (MCV list or
IN-list constant elements) is chosen as the hash table build side, and
the other side is scanned once, reducing complexity to O(N+M).
The hash-based path is restricted to equality and inequality operators
that use eqsel()/neqsel(), and is applied only when suitable hash
functions and MCV statistics are available.
---
src/backend/utils/adt/selfuncs.c | 523 ++++++++++++++++++++++++++++++-
1 file changed, 518 insertions(+), 5 deletions(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 29fec655593..509853642c1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -146,23 +146,27 @@
/*
* In production builds, switch to hash-based MCV matching when the lists are
* large enough to amortize hash setup cost. (This threshold is compared to
- * the sum of the lengths of the two MCV lists. This is simplistic but seems
+ * the sum of the lengths of the two lists. This is simplistic but seems
* to work well enough.) In debug builds, we use a smaller threshold so that
* the regression tests cover both paths well.
*/
#ifndef USE_ASSERT_CHECKING
-#define EQJOINSEL_MCV_HASH_THRESHOLD 200
+#define MCV_HASH_THRESHOLD 200
#else
-#define EQJOINSEL_MCV_HASH_THRESHOLD 20
+#define MCV_HASH_THRESHOLD 20
#endif
-/* Entries in the simplehash hash table used by eqjoinsel_find_matches */
+/*
+ * Entries in the simplehash hash table used by
+ * eqjoinsel_find_matches and scalararray_mcv_hash_match
+ */
typedef struct MCVHashEntry
{
Datum value; /* the value represented by this entry */
int index; /* its index in the relevant AttStatsSlot */
uint32 hash; /* hash code for the Datum */
char status; /* status code used by simplehash.h */
+ int count; /* number of occurrences of current value in */
} MCVHashEntry;
/* private_data for the simplehash hash table */
@@ -184,6 +188,14 @@ get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
+static double scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left, Datum *elem_values,
+ bool *elem_nulls, int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality);
+static void accum_scalararray_prob(Selectivity s1, bool useOr, bool isEquality,
+ bool isInequality, double nullfrac,
+ double *selec, double *s1disjoint);
static double eqjoinsel_inner(FmgrInfo *eqproc, Oid collation,
Oid hashLeft, Oid hashRight,
VariableStatData *vardata1, VariableStatData *vardata2,
@@ -2025,6 +2037,42 @@ scalararraysel(PlannerInfo *root,
elmlen, elmbyval, elmalign,
&elem_values, &elem_nulls, &num_elems);
+ /* For WHERE x NOT IN (NULL, ...) selectivity is always 0.0 */
+ if (!useOr && memchr(elem_nulls, true, num_elems) != NULL)
+ return (Selectivity) 0.0;
+
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then fall back to default code path to compute
+ * default selectivity.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ bool *elem_const = NULL;
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator, clause->inputcollid, other_op, var_on_left,
+ elem_values, elem_nulls, num_elems, elem_const,
+ nominal_element_type, useOr, isEquality, isInequality);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* For generic operators, we assume the probability of success is
* independent for each array element. But for "= ANY" or "<> ALL",
@@ -2100,6 +2148,87 @@ scalararraysel(PlannerInfo *root,
get_typlenbyval(arrayexpr->element_typeid,
&elmlen, &elmbyval);
+ /*
+ * Try to calculate selectivity by hash-search O(N) instead of O(N^2)
+ * in case of MCV matching. We use hash-search only for eqsel() and
+ * neqsel().
+ */
+ if ((isEquality || isInequality) && !is_join_clause)
+ {
+ VariableStatData vardata;
+ Node *other_op = NULL;
+ bool var_on_left;
+
+ /*
+ * If expression is not variable = something or something =
+ * variable, then fall back to default code path to compute
+ * default selectivity.
+ */
+ if (get_restriction_variable(root, clause->args, varRelid,
+ &vardata, &other_op, &var_on_left))
+ {
+ int num_elems;
+ Datum *elem_values;
+ bool *elem_nulls;
+ bool *elem_const;
+ ListCell *lc;
+ int i;
+
+ num_elems = list_length(arrayexpr->elements);
+ elem_values = palloc0_array(Datum, num_elems);
+ elem_nulls = palloc0_array(bool, num_elems);
+ elem_const = palloc0_array(bool, num_elems);
+
+ /*
+ * Build arrays describing ARRAY[] elements: - elem_values:
+ * Datum value for Const elements - elem_nulls: whether
+ * element is NULL - elem_const: whether element is a Const
+ * node
+ */
+ i = 0;
+ foreach(lc, arrayexpr->elements)
+ {
+ Node *elem_value = (Node *) lfirst(lc);
+
+ if (IsA(elem_value, Const))
+ {
+ elem_values[i] = ((Const *) elem_value)->constvalue;
+ elem_nulls[i] = ((Const *) elem_value)->constisnull;
+ elem_const[i] = true;
+ }
+ else
+ {
+ elem_nulls[i] = false;
+ elem_const[i] = false;
+ }
+
+ if (!useOr && elem_nulls[i])
+ {
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ return (Selectivity) 0.0;
+ }
+
+ i++;
+ }
+
+ s1 = scalararray_mcv_hash_match(&vardata, operator, clause->inputcollid, other_op, var_on_left,
+ elem_values, elem_nulls, num_elems, elem_const,
+ nominal_element_type, useOr, isEquality, isInequality);
+
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
+ }
+ }
+
/*
* We use the assumption of disjoint probabilities here too, although
* the odds of equal array elements are rather higher if the elements
@@ -2210,6 +2339,390 @@ scalararraysel(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity of a ScalarArrayOpExpr (ANY/ALL) using MCV statistics
+ * with hash-based matching.
+ *
+ * This function follows the same probability model as the generic
+ * ScalarArrayOpExpr selectivity code (independent or disjoint probabilities
+ * for OR/AND combinations), but attempts to speed up matching between
+ * IN-list elements and the column's most-common-values (MCV) statistics by
+ * using hashing instead of nested loops.
+ *
+ * MCV statistics are used only to obtain per-value selectivities for
+ * constants that match MCV entries. All probabilities are combined using
+ * the standard ANY/ALL formulas, exactly as in the generic estimator.
+ *
+ * The function may return -1.0 to indicate that hash-based MCV estimation
+ * is not applicable (for example, missing statistics, unsupported operator,
+ * or unavailable hash functions), in which case the caller should fall back
+ * to the generic ScalarArrayOpExpr selectivity estimation.
+ *
+ * Inputs:
+ * vardata: statistics and metadata for the variable being estimated
+ * operator: equality or inequality operator to apply
+ * collation: OID of collation to use
+ * other_op: expression for the non-variable side of the comparison
+ * var_on_left: true if the variable is on the left side of the operator
+ * elem_values: array of IN-list element values
+ * elem_nulls: array indicating which IN-list elements are NULL
+ * elem_const: array indicating which IN-list elements are Const nodes.
+ * array is NULL if all elemnets is const.
+ * num_elems: number of IN-list elements
+ * nominal_element_type: type of IN-list elements
+ * useOr: true if elements are combined using OR semantics, false for AND
+ * isEquality: true if the operator behaves like equality
+ * isInequality: true if the operator behaves like inequality
+ *
+ * Result:
+ * Selectivity estimate in the range [0.0, 1.0], or -1.0 if no estimate
+ * could be produced by this function.
+ *
+ * Note:
+ * This function assumes that the operator’s selectivity behavior matches
+ * eqsel()/neqsel semantics. It must not be used for operators with custom
+ * or non-standard selectivity behavior.
+ */
+static double
+scalararray_mcv_hash_match(VariableStatData *vardata, Oid operator, Oid collation,
+ Node *other_op, bool var_on_left,
+ Datum *elem_values, bool *elem_nulls, int num_elems, bool *elem_const,
+ Oid nominal_element_type, bool useOr, bool isEquality,
+ bool isInequality)
+{
+ Form_pg_statistic stats;
+ AttStatsSlot sslot;
+ FmgrInfo eqproc;
+ double selec = -1.0,
+ s1disjoint,
+ nullfrac = 0.0;
+ Oid hashLeft = InvalidOid,
+ hashRight = InvalidOid,
+ opfuncoid;
+ bool have_mcvs = false;
+
+ /*
+ * If the variable is known to be unique, MCV statistics do not represent
+ * a meaningful frequency distribution, so skip MCV-based estimation.
+ */
+ if (vardata->isunique && vardata->rel && vardata->rel->tuples >= 1.0)
+ return -1.0;
+
+ /*
+ * For inequality (<>, ALL), we compute probabilities using the negated
+ * equality operator and later transform them as
+ *
+ * p(x <> c) = 1 - p(x = c) - nullfrac
+ */
+ if (isInequality)
+ {
+ operator = get_negator(operator);
+ if (!OidIsValid(operator))
+ return -1.0;
+ }
+
+ opfuncoid = get_opcode(operator);
+ memset(&sslot, 0, sizeof(sslot));
+
+ if (HeapTupleIsValid(vardata->statsTuple))
+ {
+ if (statistic_proc_security_check(vardata, opfuncoid))
+ have_mcvs = get_attstatsslot(&sslot, vardata->statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
+ }
+
+ if (have_mcvs)
+ {
+ /*
+ * If the MCV list and IN-list are large enough, and the operator
+ * supports hashing, attempt to use hash functions so that MCV–IN
+ * matching can be done in O(N+M) instead of O(N×M).
+ */
+ if (sslot.nvalues + num_elems >= MCV_HASH_THRESHOLD)
+ {
+ fmgr_info(opfuncoid, &eqproc);
+ (void) get_op_hash_functions(operator, &hashLeft, &hashRight);
+ }
+ }
+
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ /* Use a hash table to speed up the matching */
+ LOCAL_FCINFO(fcinfo, 2);
+ LOCAL_FCINFO(hash_fcinfo, 1);
+ MCVHashTable_hash *hashTable;
+ FmgrInfo hash_proc;
+ MCVHashContext hashContext;
+ double sumallcommon = 0.0,
+ nonmcv_selec = 0.0;
+ bool isdefault;
+ bool hash_mcv;
+ double otherdistinct;
+ Datum *arrayHash;
+ Datum *arrayProbe;
+ int nvaluesHash;
+ int nvaluesProbe;
+ int nonmcv_cnt = num_elems;
+ int nonconst_cnt = 0;
+
+ /* Grab the nullfrac for use below. */
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ selec = s1disjoint = (useOr ? 0.0 : 1.0);
+
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ for (int i = 0; i < sslot.nvalues; i++)
+ sumallcommon += sslot.numbers[i];
+
+ /*
+ * Compute the total probability mass of all non-MCV values. This is
+ * the part of the column distribution not covered by MCVs.
+ */
+ nonmcv_selec = 1.0 - sumallcommon - nullfrac;
+ CLAMP_PROBABILITY(nonmcv_selec);
+
+ /*
+ * Approximate the per-value probability of a non-MCV constant by
+ * dividing the remaining probability mass by the number of other
+ * distinct values.
+ */
+ otherdistinct = get_variable_numdistinct(vardata, &isdefault) - sslot.nnumbers;
+ if (otherdistinct > 1)
+ nonmcv_selec /= otherdistinct;
+
+ if (sslot.nnumbers > 0 && nonmcv_selec > sslot.numbers[sslot.nnumbers - 1])
+ nonmcv_selec = sslot.numbers[sslot.nnumbers - 1];
+
+ /* Make sure we build the hash table on the smaller array. */
+ if (sslot.nvalues <= num_elems)
+ {
+ hash_mcv = true;
+ nvaluesHash = sslot.nvalues;
+ nvaluesProbe = num_elems;
+ arrayHash = sslot.values;
+ arrayProbe = elem_values;
+ }
+ else
+ {
+ hash_mcv = false;
+ nvaluesHash = num_elems;
+ nvaluesProbe = sslot.nvalues;
+ arrayHash = elem_values;
+ arrayProbe = sslot.values;
+ }
+
+ fmgr_info(hash_mcv ? hashLeft : hashRight, &hash_proc);
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+
+ hashContext.equal_fcinfo = fcinfo;
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.op_is_reversed = !hash_mcv;
+ hashContext.insert_mode = true;
+
+ get_typlenbyval(hash_mcv ? sslot.valuetype : nominal_element_type,
+ &hashContext.hash_typlen,
+ &hashContext.hash_typbyval);
+
+ hashTable = MCVHashTable_create(CurrentMemoryContext,
+ nvaluesHash,
+ &hashContext);
+
+ /* Build a hash table over the smaller input side. */
+ for (int i = 0; i < nvaluesHash; i++)
+ {
+ bool found = false;
+ MCVHashEntry *entry;
+
+ /*
+ * When hashing IN-list values (hash_mcv == false), we only insert
+ * constant, non-NULL elements. NULL and non-Const elements are
+ * counted separately, because they cannot participate in MCV
+ * matching and must be handled later using generic selectivity
+ * estimation.
+ */
+ if (!hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (elem_const != NULL && !elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_insert(hashTable, arrayHash[i], &found);
+
+ /*
+ * entry->count tracks how many times the same value appears, so
+ * that duplicate IN-list elements can be folded into the
+ * probability calculation.
+ */
+ if (likely(!found))
+ {
+ entry->index = i;
+ entry->count = 1;
+ }
+ else
+ entry->count++;
+ }
+
+ hashContext.insert_mode = false;
+ if (hashLeft != hashRight)
+ {
+ fmgr_info(hash_mcv ? hashRight : hashLeft, &hash_proc);
+ /* Resetting hash_fcinfo is probably unnecessary, but be safe */
+ InitFunctionCallInfoData(*hash_fcinfo, &hash_proc, 1, collation,
+ NULL, NULL);
+ hash_fcinfo->args[0].isnull = false;
+ }
+
+ for (int i = 0; i < nvaluesProbe; i++)
+ {
+ MCVHashEntry *entry;
+ Selectivity s1;
+ int nvaluesmcv;
+
+ /*
+ * When probing with IN-list elements, ignore NULLs and non-Const
+ * expressions: they cannot be matched against MCVs and will be
+ * accounted for later by generic estimation.
+ */
+ if (hash_mcv)
+ {
+ if (elem_nulls[i])
+ {
+ nonmcv_cnt--;
+ continue;
+ }
+
+ if (elem_const != NULL && !elem_const[i])
+ {
+ nonmcv_cnt--;
+ nonconst_cnt++;
+ continue;
+ }
+ }
+
+ entry = MCVHashTable_lookup(hashTable, arrayProbe[i]);
+
+ /*
+ * If found, obtain its MCV frequency and remember how many values
+ * on the hashed side map to this entry.
+ */
+ if (entry != NULL)
+ {
+ s1 = hash_mcv ? sslot.numbers[entry->index]
+ : sslot.numbers[i];
+
+ nvaluesmcv = entry->count;
+
+ /* Matched values are no longer considered non-MCV */
+ nonmcv_cnt -= nvaluesmcv;
+ }
+ else
+ {
+ /* No MCV match for this value */
+ continue;
+ }
+
+ /*
+ * Fold this value's probability into the running ANY/ALL
+ * selectivity estimate once for each occurrence.
+ */
+ for (int j = 0; j < nvaluesmcv; j++)
+ accum_scalararray_prob(s1, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+ }
+
+ /*
+ * Account for constant IN-list values that did not match any MCV.
+ *
+ * Each such value is assumed to have probability = nonmcv_selec,
+ * derived from the remaining (non-MCV) probability mass.
+ */
+ for (int i = 0; i < nonmcv_cnt; i++)
+ accum_scalararray_prob(nonmcv_selec, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+
+ /*
+ * Account for non-Const IN-list elements.
+ *
+ * These values cannot be matched against MCVs, so we rely on the
+ * operator's generic selectivity estimator for each of them.
+ */
+ if (nonconst_cnt > 0)
+ {
+ Selectivity s1 = var_eq_non_const(vardata, operator, collation,
+ other_op, var_on_left, false);
+
+ for (int i = 0; i < nonconst_cnt; i++)
+ accum_scalararray_prob(s1, useOr, isEquality, isInequality,
+ nullfrac, &selec, &s1disjoint);
+ }
+
+ /*
+ * For = ANY or <> ALL, if the IN-list elements are assumed distinct,
+ * the events are disjoint and the total probability is the sum of
+ * individual probabilities. Use that estimate if it lies in [0,1].
+ */
+ if ((useOr ? isEquality : isInequality) &&
+ s1disjoint >= 0.0 && s1disjoint <= 1.0)
+ selec = s1disjoint;
+
+ CLAMP_PROBABILITY(selec);
+
+ MCVHashTable_destroy(hashTable);
+ free_attstatsslot(&sslot);
+ }
+
+ return selec;
+}
+
+/*
+ * Accumulate the selectivity contribution of a single array element
+ * into the running ScalarArrayOpExpr selectivity estimate.
+ */
+static void
+accum_scalararray_prob(Selectivity s1, bool useOr, bool isEquality,
+ bool isInequality, double nullfrac,
+ double *selec, double *s1disjoint)
+{
+ Selectivity s2 = s1;
+
+ if (isInequality)
+ s2 = 1.0 - s2 - nullfrac;
+
+ CLAMP_PROBABILITY(s2);
+
+ if (useOr)
+ {
+ *selec = *selec + s2 - (*selec) * s2;
+ if (isEquality)
+ *s1disjoint += s2;
+ }
+ else
+ {
+ *selec = (*selec) * s2;
+ if (isInequality)
+ *s1disjoint += s2 - 1.0;
+ }
+}
+
/*
* Estimate number of elements in the array yielded by an expression.
*
@@ -2446,7 +2959,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
* If the MCV lists are long enough to justify hashing, try to look up
* hash functions for the join operator.
*/
- if ((sslot1.nvalues + sslot2.nvalues) >= EQJOINSEL_MCV_HASH_THRESHOLD)
+ if ((sslot1.nvalues + sslot2.nvalues) >= MCV_HASH_THRESHOLD)
(void) get_op_hash_functions(operator, &hashLeft, &hashRight);
}
else
--
2.34.1
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-02-24 19:32 Matheus Alcantara <[email protected]>
parent: Ilia Evdokimov <[email protected]>
1 sibling, 0 replies; 10+ messages in thread
From: Matheus Alcantara @ 2026-02-24 19:32 UTC (permalink / raw)
To: Ilia Evdokimov <[email protected]>; Tatsuya Kawata <[email protected]>; David Geier <[email protected]>; +Cc: Chengpeng Yan <[email protected]>; [email protected] <[email protected]>
Hi, thanks for working on this!
On Wed Feb 18, 2026 at 9:48 AM -03, Ilia Evdokimov wrote:
> I've fixed all the comments raised above and updated the v5 patch.
>
Here are some comments regarding v5 patch:
On scalararraysel() we have:
+ ReleaseVariableStats(vardata);
+
+ if (s1 >= 0.0)
+ return s1;
I'm wondering if we also should call ReleaseVariableStats() on the early
return?
+ if (!useOr && elem_nulls[i])
+ {
+ pfree(elem_values);
+ pfree(elem_nulls);
+ pfree(elem_const);
+
+ return (Selectivity) 0.0;
+ }
------------------
On scalararray_mcv_hash_match() free_attstatsslot() is called only on
if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight)),
perhaps it should be moved outside the if condition?
+ if (have_mcvs && OidIsValid(hashLeft) && OidIsValid(hashRight))
+ {
+ ...
+ MCVHashTable_destroy(hashTable);
+ free_attstatsslot(&sslot);
+ }
+
+ return selec;
------------------
typo: "all elements are const"
+ * array is NULL if all elemnets is const.
------------------
It's worth adding on scalararray_mcv_hash_match() an early return when
num_elems == 0? I imagine that this can happens, e.g "WHERE a =
ANY(array[]::int[]);". In this case the function should still execute
completely?
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 10+ messages in thread
* Re: Hash-based MCV matching for large IN-lists
@ 2026-02-25 21:39 Zsolt Parragi <[email protected]>
parent: Ilia Evdokimov <[email protected]>
1 sibling, 0 replies; 10+ messages in thread
From: Zsolt Parragi @ 2026-02-25 21:39 UTC (permalink / raw)
To: Ilia Evdokimov <[email protected]>; +Cc: Tatsuya Kawata <[email protected]>; David Geier <[email protected]>; Chengpeng Yan <[email protected]>; [email protected] <[email protected]>
Hello
+ hashContext.hash_fcinfo = hash_fcinfo;
+ hashContext.op_is_reversed = !hash_mcv;
+ hashContext.insert_mode = true;
Are you sure about op_is_reversed, isn't it backwards, shouldn't it be
= hash_mcv instead?
See the following testcase:
CREATE TABLE test_cross_type_bug (val float4);
INSERT INTO test_cross_type_bug
SELECT v
FROM generate_series(1, 200) AS v,
generate_series(1, 50);
ALTER TABLE test_cross_type_bug ALTER COLUMN val SET STATISTICS 200;
ANALYZE test_cross_type_bug;
SELECT string_agg(v::text, ', ') AS in_list
FROM generate_series(1, 200) AS gs(v) \gset
EXPLAIN SELECT * FROM test_cross_type_bug
WHERE val = ANY(ARRAY[:in_list]::float4[]);
EXPLAIN SELECT * FROM test_cross_type_bug
WHERE val = ANY(ARRAY[:in_list]::float8[]);
DROP TABLE test_cross_type_bug;
^ permalink raw reply [nested|flat] 10+ messages in thread
end of thread, other threads:[~2026-02-25 21:39 UTC | newest]
Thread overview: 10+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-01-14 10:19 Re: Hash-based MCV matching for large IN-lists Ilia Evdokimov <[email protected]>
2026-01-19 14:01 ` David Geier <[email protected]>
2026-01-27 15:43 ` Ilia Evdokimov <[email protected]>
2026-01-28 13:08 ` Chengpeng Yan <[email protected]>
2026-01-29 11:14 ` Ilia Evdokimov <[email protected]>
2026-02-02 09:29 ` David Geier <[email protected]>
2026-02-07 07:42 ` Tatsuya Kawata <[email protected]>
2026-02-18 12:48 ` Ilia Evdokimov <[email protected]>
2026-02-24 19:32 ` Matheus Alcantara <[email protected]>
2026-02-25 21:39 ` Zsolt Parragi <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox