MIME-Version: 1.0
From: =?UTF-8?Q?Enrique_S=C3=A1nchez?= <enriqueesanchz@gmail.com>
Date: Mon, 18 May 2026 18:09:03 +0200
Message-ID: 
 <CAOCkzAkmnAjEWxfP-vBTRta7cHe5CMu8i4SFSeOi2nS8+Kj7Ng@mail.gmail.com>
Subject: Extended statistics improvement: multi-column MCV missing values
To: pgsql-hackers@lists.postgresql.org
Content-Type: multipart/alternative; boundary="000000000000a99865065219c8cc"
Archived-At: 
 <https://www.postgresql.org/message-id/CAOCkzAkmnAjEWxfP-vBTRta7cHe5CMu8i4SFSeOi2nS8%2BKj7Ng%40mail.gmail.com>
Precedence: bulk

--000000000000a99865065219c8cc
Content-Type: text/plain; charset="UTF-8"

Hi,

Postgres only uses multi-column MCVs when the value we are looking for is
in the list. If not, it falls back into individual independent statistics
to estimate selectivity.
However, a miss in a multi-column MCV list still yields valuable
information that it currently throws away: we know that the combination's
frequency is strictly bounded by the frequency of the last (least common)
item in that MCV list.

Test case
=========
```
-- 1. Create a minimal table
DROP TABLE IF EXISTS planner_trap;
CREATE TABLE planner_trap (
    col_a integer,
    col_b integer,
    col_c integer,
    sort_col integer
);

-- 2. 1 becomes the most common value (MCV) for all three columns.
INSERT INTO planner_trap (col_a, col_b, col_c, sort_col)
SELECT
    (pow(random(), 5) * 100)::integer + 1,
    (pow(random(), 5) * 10)::integer + 1,
    (pow(random(), 5) * 50)::integer + 1,
    i
FROM generate_series(1, 100000) AS i;

-- 3. Create indexes
CREATE INDEX idx_planner_trap_sort ON planner_trap(sort_col);
CREATE INDEX idx_planner_trap_a_b_c ON planner_trap(col_a, col_b, col_c);

-- 4. Make the exact combination (1, 1, 1) yield zero rows. This ensures it
won't appear
-- in the MCV list, even though the value '1' remains the most common for
each individual column.
UPDATE planner_trap
SET col_a = 2
WHERE col_a = 1 AND col_b = 1 AND col_c = 1;

-- 5. Create MCV statistics
CREATE STATISTICS (mcv) on col_a, col_b, col_c FROM planner_trap;
ANALYZE planner_trap;
```

The planner multiplies the individual selectivities and overestimates the
row count (5909; actual=0):
```
postgres=# EXPLAIN ANALYZE SELECT * FROM planner_trap WHERE col_a = 1 AND
col_b = 1 AND col_c = 1 ORDER BY sort_col DESC LIMIT 10;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..7.72 rows=10 width=16) (actual time=14.632..14.633
rows=0.00 loops=1)
   Buffers: shared hit=16355 dirtied=1
   ->  Index Scan Backward using idx_planner_trap_sort on planner_trap
 (cost=0.29..4386.99 rows=5909 width=16) (actual time=14.630..14.630
rows=0.00 loops=1)
         Filter: ((col_a = 1) AND (col_b = 1) AND (col_c = 1))
         Rows Removed by Filter: 100000
         Index Searches: 1
         Buffers: shared hit=16355 dirtied=1
 Planning:
   Buffers: shared hit=33 read=1
 Planning Time: 0.478 ms
 Execution Time: 14.651 ms
```

1. Cap selectivity with the last MCV item's frequency
=====================================================
Applying the last MCV cap here, Postgres estimates 117 rows: 0.001167 *
100000 = 117
```
postgres=# SELECT m.* FROM pg_statistic_ext join pg_statistic_ext_data on
(oid = stxoid),
                pg_mcv_list_items(stxdmcv) m WHERE stxname =
'planner_trap_col_a_col_b_col_c_stat' and index = 99;
 index |  values  |  nulls  |       frequency       | base_frequency
-------+----------+---------+-----------------------+----------------
    99 | {1,1,31} | {f,f,f} | 0.0011666666666666668 | 0.001049409536
```

making postgres to choose a better plan:
```
 Limit  (cost=300.72..300.75 rows=10 width=16) (actual time=0.045..0.046
rows=0.00 loops=1)
   Buffers: shared hit=3 read=2
   ->  Sort  (cost=300.72..301.02 rows=117 width=16) (actual
time=0.043..0.044 rows=0.00 loops=1)
         Sort Key: sort_col DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3 read=2
         ->  Bitmap Heap Scan on planner_trap  (cost=5.78..298.19 rows=117
width=16) (actual time=0.026..0.027 rows=0.00 loops=1)
               Recheck Cond: ((col_a = 1) AND (col_b = 1) AND (col_c = 1))
               Buffers: shared read=2
               ->  Bitmap Index Scan on idx_planner_trap_a_b_c
 (cost=0.00..5.76 rows=117 width=0) (actual time=0.019..0.020 rows=0.00
loops=1)
                     Index Cond: ((col_a = 1) AND (col_b = 1) AND (col_c =
1))
                     Index Searches: 1
                     Buffers: shared read=2
 Planning:
   Buffers: shared hit=51 read=8 dirtied=4
 Planning Time: 0.742 ms
 Execution Time: 0.094 ms
```

2. Estimate selectivity as Postgres does for single-column values not in
MCVs
=============================================================================
While that significantly improves estimations, we could mirror what
Postgres already does for individual MCVs. Quote from the official
documentation:
> The approach is to use the fact that the value is not in the list,
combined with the knowledge of the frequencies for all of the MCVs:
> That is, add up all the frequencies for the MCVs and subtract them from
one, then divide by the number of other distinct values.

To achieve this, we need to store an ndistinct estimation alongside the
MCVs that can be used for partial or entire column match.

P(1, 1, 1) = (1 - sum(MCVs)) / (ndistinct(col_a, col_b, col_c) -
MCV_list_size)

```
postgres=# SELECT sum(m.frequency) FROM pg_statistic_ext join
pg_statistic_ext_data on (oid = stxoid),
                pg_mcv_list_items(stxdmcv) m WHERE stxname =
'planner_trap_col_a_col_b_col_c_stat';
         sum
---------------------
 0.39456666666666645

postgres=# SELECT stxkeys AS k, stxdndistinct AS nd
  FROM pg_statistic_ext join pg_statistic_ext_data on (oid = stxoid)
  WHERE stxname = 'stts2';
   k   |                      nd

-------+---------------------------------------------------
 1 2 3 | [...{"attributes": [1, 2, 3], "ndistinct": 8511}]
```

Then (1 - 0.39456666666666645) / (8511 - 100) = 0.000071981; 0.000071981 *
100000 = 7 rows
```
 Limit  (cost=30.30..30.32 rows=7 width=16) (actual time=0.035..0.036
rows=0.00 loops=1)
   Buffers: shared hit=2
   ->  Sort  (cost=30.30..30.32 rows=7 width=16) (actual time=0.033..0.034
rows=0.00 loops=1)
         Sort Key: sort_col DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=2
         ->  Bitmap Heap Scan on planner_trap  (cost=4.38..30.20 rows=7
width=16) (actual time=0.028..0.029 rows=0.00 loops=1)
               Recheck Cond: ((col_a = 1) AND (col_b = 1) AND (col_c = 1))
               Buffers: shared hit=2
               ->  Bitmap Index Scan on idx_planner_trap_a_b_c
 (cost=0.00..4.38 rows=7 width=0) (actual time=0.022..0.022 rows=0.00
loops=1)
                     Index Cond: ((col_a = 1) AND (col_b = 1) AND (col_c =
1))
                     Index Searches: 1
                     Buffers: shared hit=2
 Planning Time: 0.192 ms
 Execution Time: 0.057 ms
```

I think this is a cheap way to prevent bad estimations. The storage
overhead of adding an ndistinct field is trivial compared to the MCV list
itself, and the O(1) arithmetic during planning adds no measurable
overhead. I look forward to your feedback before drafting a patch.

Best,
Enrique.

--000000000000a99865065219c8cc
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><span style=3D"font-family:monospace">Hi,<br><br>Postgres =
only uses multi-column MCVs when the value we are looking for is in the lis=
t. If not, it falls back into individual independent statistics to estimate=
 selectivity.<br>However, a miss in a multi-column MCV list still yields va=
luable information that it currently throws away: we know that the combinat=
ion&#39;s frequency is strictly bounded by the frequency of the last (least=
 common) item in that MCV list.<br><br>Test case<br>=3D=3D=3D=3D=3D=3D=3D=
=3D=3D<br>```<br>-- 1. Create a minimal table<br>DROP TABLE IF EXISTS plann=
er_trap;<br>CREATE TABLE planner_trap (<br>=C2=A0 =C2=A0 col_a integer,<br>=
=C2=A0 =C2=A0 col_b integer,<br>=C2=A0 =C2=A0 col_c integer,<br>=C2=A0 =C2=
=A0 sort_col integer<br>);<br><br>-- 2. 1 becomes the most common value (MC=
V) for all three columns.<br>INSERT INTO planner_trap (col_a, col_b, col_c,=
 sort_col)<br>SELECT <br>=C2=A0 =C2=A0 (pow(random(), 5) * 100)::integer + =
1,<br>=C2=A0 =C2=A0 (pow(random(), 5) * 10)::integer + 1,<br>=C2=A0 =C2=A0 =
(pow(random(), 5) * 50)::integer + 1,<br>=C2=A0 =C2=A0 i<br>FROM generate_s=
eries(1, 100000) AS i;<br><br>-- 3. Create indexes<br>CREATE INDEX idx_plan=
ner_trap_sort ON planner_trap(sort_col);<br>CREATE INDEX idx_planner_trap_a=
_b_c ON planner_trap(col_a, col_b, col_c);<br><br>-- 4. Make the exact comb=
ination (1, 1, 1) yield zero rows. This ensures it won&#39;t appear<br>-- i=
n the MCV list, even though the value &#39;1&#39; remains the most common f=
or each individual column.<br>UPDATE planner_trap <br>SET col_a =3D 2 <br>W=
HERE col_a =3D 1 AND col_b =3D 1 AND col_c =3D 1;<br><br>-- 5. Create MCV s=
tatistics<br>CREATE STATISTICS (mcv) on col_a, col_b, col_c FROM planner_tr=
ap;<br>ANALYZE planner_trap;<br>```<br><br>The planner multiplies the indiv=
idual selectivities and overestimates the row count (5909; actual=3D0):<br>=
```<br>postgres=3D# EXPLAIN ANALYZE SELECT * FROM planner_trap WHERE col_a =
=3D 1 AND col_b =3D 1 AND col_c =3D 1 ORDER BY sort_col DESC LIMIT 10;<br>=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 QUERY PLAN =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 <br>---------------------------------------------------------=
---------------------------------------------------------------------------=
---------------------------<br>=C2=A0Limit =C2=A0(cost=3D0.29..7.72 rows=3D=
10 width=3D16) (actual time=3D14.632..14.633 rows=3D0.00 loops=3D1)<br>=C2=
=A0 =C2=A0Buffers: shared hit=3D16355 dirtied=3D1<br>=C2=A0 =C2=A0-&gt; =C2=
=A0Index Scan Backward using idx_planner_trap_sort on planner_trap =C2=A0(c=
ost=3D0.29..4386.99 rows=3D5909 width=3D16) (actual time=3D14.630..14.630 r=
ows=3D0.00 loops=3D1)<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Filter: ((col_a =
=3D 1) AND (col_b =3D 1) AND (col_c =3D 1))<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0Rows Removed by Filter: 100000<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0I=
ndex Searches: 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Buffers: shared hit=
=3D16355 dirtied=3D1<br>=C2=A0Planning:<br>=C2=A0 =C2=A0Buffers: shared hit=
=3D33 read=3D1<br>=C2=A0Planning Time: 0.478 ms<br>=C2=A0Execution Time: 14=
.651 ms<br>```<br><br>1. Cap selectivity with the last MCV item&#39;s frequ=
ency<br>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D<br>Applying the last MCV cap here, Postgres estimates 11=
7 rows: 0.001167 * 100000 =3D 117<br>```<br>postgres=3D# SELECT m.* FROM pg=
_statistic_ext join pg_statistic_ext_data on (oid =3D stxoid),<br>=C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 pg_mcv_list_items(stxdmcv)=
 m WHERE stxname =3D &#39;planner_trap_col_a_col_b_col_c_stat&#39; and inde=
x =3D 99;<br>=C2=A0index | =C2=A0values =C2=A0| =C2=A0nulls =C2=A0| =C2=A0 =
=C2=A0 =C2=A0 frequency =C2=A0 =C2=A0 =C2=A0 | base_frequency <br>-------+-=
---------+---------+-----------------------+----------------<br>=C2=A0 =C2=
=A0 99 | {1,1,31} | {f,f,f} | 0.0011666666666666668 | 0.001049409536<br>```=
<br><br>making postgres to choose a better plan:<br>```<br>=C2=A0Limit =C2=
=A0(cost=3D300.72..300.75 rows=3D10 width=3D16) (actual time=3D0.045..0.046=
 rows=3D0.00 loops=3D1)<br>=C2=A0 =C2=A0Buffers: shared hit=3D3 read=3D2<br=
>=C2=A0 =C2=A0-&gt; =C2=A0Sort =C2=A0(cost=3D300.72..301.02 rows=3D117 widt=
h=3D16) (actual time=3D0.043..0.044 rows=3D0.00 loops=3D1)<br>=C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0Sort Key: sort_col DESC<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0Sort Method: quicksort =C2=A0Memory: 25kB<br>=C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0Buffers: shared hit=3D3 read=3D2<br>=C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0-&gt; =C2=A0Bitmap Heap Scan on planner_trap =C2=A0(cost=3D5.78..=
298.19 rows=3D117 width=3D16) (actual time=3D0.026..0.027 rows=3D0.00 loops=
=3D1)<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Recheck Con=
d: ((col_a =3D 1) AND (col_b =3D 1) AND (col_c =3D 1))<br>=C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Buffers: shared read=3D2<br>=C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-&gt; =C2=A0Bitmap Index Sc=
an on idx_planner_trap_a_b_c =C2=A0(cost=3D0.00..5.76 rows=3D117 width=3D0)=
 (actual time=3D0.019..0.020 rows=3D0.00 loops=3D1)<br>=C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Index Cond: ((col_a=
 =3D 1) AND (col_b =3D 1) AND (col_c =3D 1))<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Index Searches: 1<br>=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Bu=
ffers: shared read=3D2<br>=C2=A0Planning:<br>=C2=A0 =C2=A0Buffers: shared h=
it=3D51 read=3D8 dirtied=3D4<br>=C2=A0Planning Time: 0.742 ms<br>=C2=A0Exec=
ution Time: 0.094 ms<br>```<br><br>2. Estimate selectivity as Postgres does=
 for single-column values not in MCVs<br>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D<br>While that significantl=
y improves estimations, we could mirror what Postgres already does for indi=
vidual MCVs. Quote from the official documentation:<br>&gt; The approach is=
 to use the fact that the value is not in the list, combined with the knowl=
edge of the frequencies for all of the MCVs:<br>&gt; That is, add up all th=
e frequencies for the MCVs and subtract them from one, then divide by the n=
umber of other distinct values.<br><br>To achieve this, we need to store an=
 ndistinct estimation alongside the MCVs that can be used for partial or en=
tire column match.<br><br>P(1, 1, 1) =3D (1 - sum(MCVs)) / (ndistinct(col_a=
, col_b, col_c) - MCV_list_size)<br><br>```<br>postgres=3D# SELECT sum(m.fr=
equency) FROM pg_statistic_ext join pg_statistic_ext_data on (oid =3D stxoi=
d),<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 pg_mcv_list_=
items(stxdmcv) m WHERE stxname =3D &#39;planner_trap_col_a_col_b_col_c_stat=
&#39;;<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0sum =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 <br>---------------------<br>=C2=A00.39456666666666645<br>=C2=A0<br>postgr=
es=3D# SELECT stxkeys AS k, stxdndistinct AS nd<br>=C2=A0 FROM pg_statistic=
_ext join pg_statistic_ext_data on (oid =3D stxoid)<br>=C2=A0 WHERE stxname=
 =3D &#39;stts2&#39;;<br>=C2=A0 =C2=A0k =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0nd =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 <br>-------+---=
------------------------------------------------<br>=C2=A01 2 3 | [...{&quo=
t;attributes&quot;: [1, 2, 3], &quot;ndistinct&quot;: 8511}]<br>```<br><br>=
Then (1 - 0.39456666666666645) / (8511 - 100) =3D 0.000071981; 0.000071981 =
* 100000 =3D 7 rows<br>```<br>=C2=A0Limit =C2=A0(cost=3D30.30..30.32 rows=
=3D7 width=3D16) (actual time=3D0.035..0.036 rows=3D0.00 loops=3D1)<br>=C2=
=A0 =C2=A0Buffers: shared hit=3D2<br>=C2=A0 =C2=A0-&gt; =C2=A0Sort =C2=A0(c=
ost=3D30.30..30.32 rows=3D7 width=3D16) (actual time=3D0.033..0.034 rows=3D=
0.00 loops=3D1)<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Sort Key: sort_col DES=
C<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Sort Method: quicksort =C2=A0Memory:=
 25kB<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Buffers: shared hit=3D2<br>=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-&gt; =C2=A0Bitmap Heap Scan on planner_trap=
 =C2=A0(cost=3D4.38..30.20 rows=3D7 width=3D16) (actual time=3D0.028..0.029=
 rows=3D0.00 loops=3D1)<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0Recheck Cond: ((col_a =3D 1) AND (col_b =3D 1) AND (col_c =3D 1))<br=
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Buffers: shared hit=
=3D2<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-&gt; =C2=A0=
Bitmap Index Scan on idx_planner_trap_a_b_c =C2=A0(cost=3D0.00..4.38 rows=
=3D7 width=3D0) (actual time=3D0.022..0.022 rows=3D0.00 loops=3D1)<br>=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0In=
dex Cond: ((col_a =3D 1) AND (col_b =3D 1) AND (col_c =3D 1))<br>=C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Index Sea=
rches: 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0Buffers: shared hit=3D2<br>=C2=A0Planning Time: 0.192 ms<br>=
=C2=A0Execution Time: 0.057 ms<br>```<br><br>I think this is a cheap way to=
 prevent bad estimations. The storage overhead of adding an ndistinct field=
 is trivial compared to the MCV list itself, and the O(1) arithmetic during=
 planning adds no measurable overhead. I look forward to your feedback befo=
re drafting a patch.<br><br>Best,<br>Enrique.</span></div>

--000000000000a99865065219c8cc--