pgsql: Use extended stats for precise estimation of bucket size in hash
* pgsql: Use extended stats for precise estimation of bucket size in hash
  @ 2025-03-10 11:47 Alexander Korotkov <[email protected]>

From: Alexander Korotkov @ 2025-03-10 11:47 UTC
To: [email protected]

Use extended stats for precise estimation of bucket size in hash join

Columns in real-life tables often have functional dependencies, so PostgreSQL's estimate of the number of distinct values over a set of columns can be too low (or, much more rarely, too high) when a JOIN has multiple clauses. For a hash join, this can lead to a small number of predicted hash buckets and, as a result, to the planner picking a non-optimal merge join.

To improve the situation, we introduce one additional stage of bucket size estimation: when there are two or more join clauses, the estimator looks up extended statistics and uses them for multicolumn estimation. Clauses are grouped into lists, each containing expressions referencing the same relation. The result of the multicolumn estimation made over such a list is combined with the others according to the caller's logic. Clauses that are not estimated are returned to the caller for further estimation.
Discussion: https://postgr.es/m/52257607-57f6-850d-399a-ec33a654457b%40postgrespro.ru
Author: Andrei Lepikhov <[email protected]>
Reviewed-by: Andy Fan <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]>
Reviewed-by: Alena Rybakina <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6bb6a62f3cc45624c601d5270673a17447734629

Modified Files
--------------
src/backend/optimizer/path/costsize.c   |  12 ++-
src/backend/utils/adt/selfuncs.c        | 175 ++++++++++++++++++++++++++++++++
src/include/utils/selfuncs.h            |   4 +
src/test/regress/expected/stats_ext.out |  45 ++++++++
src/test/regress/sql/stats_ext.sql      |  29 ++++++

5 files changed, 264 insertions(+), 1 deletion(-)
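The grouping step described in the commit message can be illustrated with a small sketch. This is not the C code from selfuncs.c; it is a simplified Python model, and every name in it (`group_by_relation`, `estimate_multicolumn_buckets`, the `ext_ndistinct` lookup table) is hypothetical. It assumes a join-clause side is just a (relation, column) pair, that extended statistics are a mapping from a relation and column set to a distinct-value count, and that the bucket size fraction is approximated as 1/ndistinct.

```python
from collections import defaultdict


def group_by_relation(clauses):
    """Group (relation, column) clause sides into per-relation column lists."""
    groups = defaultdict(list)
    for relation, column in clauses:
        groups[relation].append(column)
    return dict(groups)


def estimate_multicolumn_buckets(clauses, ext_ndistinct):
    """Return (estimated, leftover).

    estimated: relation -> bucket size fraction, for groups of two or more
               columns that have a matching multicolumn ndistinct entry.
    leftover:  clauses that were not estimated and go back to the caller
               for the ordinary per-clause estimation.

    ext_ndistinct is a hypothetical stand-in for extended statistics:
    it maps (relation, frozenset(columns)) to a distinct-value count.
    """
    estimated = {}
    leftover = []
    for relation, columns in group_by_relation(clauses).items():
        ndistinct = ext_ndistinct.get((relation, frozenset(columns)))
        if len(columns) >= 2 and ndistinct:
            # Simplifying assumption: fraction of rows per bucket ~ 1/ndistinct.
            estimated[relation] = 1.0 / ndistinct
        else:
            leftover.extend((relation, column) for column in columns)
    return estimated, leftover
```

Calling `estimate_multicolumn_buckets` with two clauses on one relation and a matching statistics entry yields an estimate for that relation, while a lone clause on another relation comes back in `leftover`, mirroring the "returned to the caller" behaviour the message describes.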
* Re: pgsql: Use extended stats for precise estimation of bucket size in hash
  @ 2025-04-10 00:37 Robins Tharakan <[email protected]>

From: Robins Tharakan @ 2025-04-10 00:37 UTC
To: Alexander Korotkov <[email protected]>
Cc: [email protected]

Hi,

On Mon, 10 Mar 2025 at 22:18, Alexander Korotkov <[email protected]> wrote:
>
> Use extended stats for precise estimation of bucket size in hash join

After this commit, I see a recurrence of an error similar to the one fixed in
e28033fe1af8037e0fec8bb3a32fabbe18ac06b1

https://www.postgresql.org/message-id/18885-da51324078588253%40postgresql.org

-
robins
https://robins.in
* Re: pgsql: Use extended stats for precise estimation of bucket size in hash
  @ 2025-04-10 08:23 Alexander Korotkov <[email protected]>

From: Alexander Korotkov @ 2025-04-10 08:23 UTC
To: Robins Tharakan <[email protected]>
Cc: Alexander Korotkov <[email protected]>; [email protected]

On Thu, Apr 10, 2025 at 3:37 AM Robins Tharakan <[email protected]> wrote:
> On Mon, 10 Mar 2025 at 22:18, Alexander Korotkov <[email protected]> wrote:
> >
> > Use extended stats for precise estimation of bucket size in hash join
>
>
> After this commit, I see a recurrence of an error similar to the one fixed in e28033fe1af8037e0fec8bb3a32fabbe18ac06b1
>
> https://www.postgresql.org/message-id/18885-da51324078588253%40postgresql.org

Thank you, I'm looking into that.

------
Regards,
Alexander Korotkov
Supabase