public inbox for [email protected]
help / color / mirror / Atom feedFrom: jian he <[email protected]>
To: vignesh C <[email protected]>
Cc: Schoemans Maxime <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Damir Belyalov <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: SAKR Mahmoud <[email protected]>
Cc: Diogo Repas <[email protected]>
Cc: LUO Zhicheng <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Andrey Lepikhov <[email protected]>
Subject: Re: Implement missing join selectivity estimation for range types
Date: Mon, 22 Jan 2024 16:10:37 +0800
Message-ID: <CACJufxFhES7untAPmxJDrB-y_pNgah_1HNbXjcOj6g=yxh3kdA@mail.gmail.com> (raw)
In-Reply-To: <CALDaNm3A4j7=rmY128Eb3v8zi=hHAnGizUcbSX5ekZKHy-wmMA@mail.gmail.com>
References: <CAB4o4asMq3k6HN9WfDsssQ5DDVfAziB4TpiFJ8RBJgZTVuwC7g@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAB4o4aud47V_iRyWtA8+ZAmdXDjCF165R73AeCjx2RL0nzQzHA@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAB4o4asvPN=NT7KvS9zVQjZbdsiRW5t8aQctEkW7mxc4hbBxVQ@mail.gmail.com>
<[email protected]>
<[email protected]>
<[email protected]>
<CALH1Lgurg1y1DTeFOXOkpP=+X7saVGCh8gSjDLoCBOcFFWhz-A@mail.gmail.com>
<[email protected]>
<[email protected]>
<[email protected]>
<CALDaNm29NMhpPvoM49qqfYzVcRJq0CPTd-xSUcTF7RZQO2TASQ@mail.gmail.com>
<[email protected]>
<CALDaNm3A4j7=rmY128Eb3v8zi=hHAnGizUcbSX5ekZKHy-wmMA@mail.gmail.com>
I cannot figure out why it aborts.
as Tom mentioned in upthread about the test cases.
similar to src/test/regress/sql/stats_ext.sql check_estimated_rows function.
we can test it by something:
create or replace function check_estimated_rows(text) returns table (ok bool)
language plpgsql as
$$
declare
ln text;
tmp text[];
first_row bool := true;
begin
for ln in
execute format('explain analyze %s', $1)
loop
if first_row then
first_row := false;
tmp := regexp_match(ln, 'rows=(\d*) .* rows=(\d*)');
return query select 0.2 < tmp[1]::float8 / tmp[2]::float8
and tmp[1]::float8 / tmp[2]::float8 < 5;
end if;
end loop;
end;
$$;
select * from check_estimated_rows($$select * from test_range_join_1,
test_range_join_2 where ir1 && ir2$$);
select * from check_estimated_rows($$select * from test_range_join_1,
test_range_join_2 where ir1 << ir2$$);
select * from check_estimated_rows($$select * from test_range_join_1,
test_range_join_2 where ir1 >> ir2$$);
Do you need 3 tables to do the test? because we need to actually run
the query then compare the estimated row
and actually returned rows.
If you really execute the query with 3 table joins, it will take a lot of time.
So two tables join with where quql should be fine?
/* Fast-forwards i and j to start of iteration */
+ for (i = 0; range_cmp_bound_values(typcache, &hist1[i], &hist2[0]) < 0; i++);
+ for (j = 0; range_cmp_bound_values(typcache, &hist2[j], &hist1[0]) < 0; j++);
+
+ /* Do the estimation on overlapping regions */
+ while (i < nhist1 && j < nhist2)
+ {
+ double cur_sel1,
+ cur_sel2;
+ RangeBound cur_sync;
+
+ if (range_cmp_bound_values(typcache, &hist1[i], &hist2[j]) < 0)
+ cur_sync = hist1[i++];
+ else if (range_cmp_bound_values(typcache, &hist1[i], &hist2[j]) > 0)
+ cur_sync = hist2[j++];
+ else
+ {
+ /* If equal, skip one */
+ cur_sync = hist1[i];
+
this part range_cmp_bound_values "if else if" part computed twice, you
can just do
`
int cmp;
cmp = range_cmp_bound_values(typcache, &hist1[i], &hist2[j]);
if cmp <0 then
else if cmp > 0 then
else then
`
also. I think you can put the following into main while loop.
+ for (i = 0; range_cmp_bound_values(typcache, &hist1[i], &hist2[0]) < 0; i++);
+ for (j = 0; range_cmp_bound_values(typcache, &hist2[j], &hist1[0]) < 0; j++);
split range and multirange into 2 patches might be a good idea.
seems: same function (calc_hist_join_selectivity) with same function
signature in src/backend/utils/adt/multirangetypes_selfuncs.c
and src/backend/utils/adt/rangetypes_selfuncs.c,
previously mail complaints not resolved.
view thread (16+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Implement missing join selectivity estimation for range types
In-Reply-To: <CACJufxFhES7untAPmxJDrB-y_pNgah_1HNbXjcOj6g=yxh3kdA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox