public inbox for [email protected]
help / color / mirror / Atom feedFrom: Laurenz Albe <[email protected]>
To: Ron Johnson <[email protected]>
To: pgsql-admin <[email protected]>
Subject: Re: pg_stats.correlation rule of thumb for re-clustering a table?
Date: Fri, 12 Sep 2025 22:59:07 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CANzqJaCEvOm_zY7Ex_vN3r5cQLzWx9DxYd3hV=pwPmm4wsJGWQ@mail.gmail.com>
References: <CANzqJaCEvOm_zY7Ex_vN3r5cQLzWx9DxYd3hV=pwPmm4wsJGWQ@mail.gmail.com>
On Fri, 2025-09-12 at 10:46 -0400, Ron Johnson wrote:
> Purely OLTP tables (that are only accessed randomly) can of course live with 0% correlation,
> but lots of tables are mixed-use, and so benefit from physical ordering on a carefully chosen field..
>
> SELECT abs(correlation)::numeric(3,2) as correlation
> FROM pg_stats
> WHERE schemaname = 'foo' AND tablename = 'bar'
> AND attname = 'blarge';
> correlation
> -------------
> 0.84
> (1 row)
>
> Obviously 84% is no need to worry, but what about 60% or 40%? Currently, I use 60%, but would like to do better.
Either the difference is gradual, so that it there is no real cut-off point,
or there is a sudden plan change at some point that depends on the query the
data and the parameter settings. I don't think it is possible to give reliable
numbers that cover all cases.
I suggest that you run a series of benchmarks with a copy of the table with
different correlation values and come up with numbers that are meaningful
for your individual case.
Yours,
Laurenz Albe
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: pg_stats.correlation rule of thumb for re-clustering a table?
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox