public inbox for [email protected]
help / color / mirror / Atom feedFrom: Asad Ali <[email protected]>
To: Ilia Evdokimov <[email protected]>
Cc: [email protected]
Subject: Re: Duplicate Extended Statistics
Date: Wed, 4 Sep 2024 13:28:45 +0500
Message-ID: <CAJ9xe=u0FykfAi-ihsjVRkZ3XiJLbW97+BYhEnFo1_3wcps8AA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
Hi Ilia,
In PostgreSQL, it is possible to create duplicate extended statistics
because the system does not enforce uniqueness on statistics definitions.
However, this is generally not recommended, as it leads to longer ANALYZE
times, increased storage usage, potential planner performance impact, and
unnecessary complexity. In practice, duplicates are rare because users and
tools usually avoid redundancy, as there is no added benefit to having
multiple identical sets of statistics on the same columns.
Regards,
Asad Ali
On Tue, Sep 3, 2024 at 6:10 PM Ilia Evdokimov <[email protected]>
wrote:
> Hello everyone,
>
> I have a question regarding extended statistics in PostgreSQL. Why is it
> possible to create duplicate extended statistics? To make it clearer,
> here’s an example:
>
> CREATE TABLE t(a int, b int);
> INSERT INTO t(a, b) VALUES (...);
> CREATE STATISTICS ON a, b FROM t;
> ANALYZE t;
> ....
> CREATE STATISTICS ON a, b FROM t;
> ANALYZE t;
>
> After executing these queries, the following issues might arise:
>
> 1. ANALYZE will take longer to run because, for example, MCV extended
> statistics would need to be gathered twice.
> 2. Duplicate information will be stored.
> 3. The planner might take longer to find the relevant statistics since
> it has to search through them in a loop.
>
> Or do duplicate extended statistics practically never occur in practice?
>
> Thanks in advance for your response.
>
> --
> Regards,
> Ilia Evdokimov,
> Tantor Labs LCC.
>
>
view thread (4+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Duplicate Extended Statistics
In-Reply-To: <CAJ9xe=u0FykfAi-ihsjVRkZ3XiJLbW97+BYhEnFo1_3wcps8AA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox