Histogram question.



From: Jian He @ 2022-04-05 14:34 UTC (permalink / raw)
  To: pgsql-sql <[email protected]>

Queries in PostgreSQL: 2. Statistics : Postgres Professional
<https://postgrespro.com/blog/pgsql/5969296>



SELECT sum(s.most_common_freqs[
         array_position((s.most_common_vals::text::text[]), v)
       ])
FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no';

returns 0.6762.

SELECT sum(s.most_common_freqs[
         array_position((s.most_common_vals::text::text[]), v)
       ])
FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no'
  AND v > '30C';

returns 0.2127.

SELECT round( reltuples * (
    0.2127                              -- from most common values
  + (1 - 0.6762 - 0) * (49 / 100.0)     -- from histogram
))
FROM pg_class WHERE relname = 'boarding_passes';

In the query above, the part I don't understand is the 49/100.



From: Steve Midgley @ 2022-04-05 15:14 UTC (permalink / raw)
  To: Jian He <[email protected]>; +Cc: pgsql-sql <[email protected]>

On Tue, Apr 5, 2022 at 7:35 AM Jian He <[email protected]> wrote:

> Queries in PostgreSQL: 2. Statistics : Postgres Professional
> <https://postgrespro.com/blog/pgsql/5969296>
>
> SELECT sum(s.most_common_freqs[
>          array_position((s.most_common_vals::text::text[]), v)
>        ])
> FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
> WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no';
>
> returns 0.6762.
>
> SELECT sum(s.most_common_freqs[
>          array_position((s.most_common_vals::text::text[]), v)
>        ])
> FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
> WHERE s.tablename = 'boarding_passes' AND s.attname = 'seat_no'
>   AND v > '30C';
>
> returns 0.2127.
>
> SELECT round( reltuples * (
>     0.2127                              -- from most common values
>   + (1 - 0.6762 - 0) * (49 / 100.0)     -- from histogram
> ))
> FROM pg_class WHERE relname = 'boarding_passes';
>
> In the query above, the part I don't understand is the 49/100.
>
I believe the exercise builds a histogram of the data values over a series
of intervals (buckets). If I'm reading the source material correctly, the
49/100 refers to the boarding passes that fall in the lower 49 of the 100
intervals. I didn't read how the intervals are defined, but I think that's
what the "49" refers to.
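
To make the arithmetic concrete, here is a minimal Python sketch of how the
numbers in the last query appear to combine. It assumes the "- 0" term is the
column's NULL fraction and that 49/100 is the share of histogram buckets
matching the condition. The function names are hypothetical and are not taken
from the article or from PostgreSQL.

```python
def estimate_selectivity(mcv_match, mcv_total, null_frac,
                         buckets_match, buckets_total):
    """Combine the most-common-values part and the histogram part."""
    # Share of rows that are neither NULL nor in the most-common-values
    # list; only these rows are described by the histogram (assumption).
    histogram_share = 1.0 - mcv_total - null_frac
    # Fraction of histogram buckets assumed to satisfy the condition.
    bucket_fraction = buckets_match / buckets_total
    return mcv_match + histogram_share * bucket_fraction


def estimate_rows(reltuples, selectivity):
    """Scale the selectivity by the table's estimated row count."""
    return round(reltuples * selectivity)


# Numbers quoted in the thread: 0.2127, 0.6762, 0 and 49/100.
sel = estimate_selectivity(0.2127, 0.6762, 0.0, 49, 100)
print(f"{sel:.4f}")  # 0.3714
```

Passing that selectivity and the table's reltuples from pg_class to
estimate_rows would give the same figure as the round(...) expression in the
query, under the assumptions above.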

