public inbox for [email protected]
help / color / mirror / Atom feedFrom: Laurenz Albe <[email protected]>
To: richard coleman <[email protected]>
To: Pgsql-admin <[email protected]>
Subject: Re: analyze foreign tables
Date: Tue, 01 Aug 2023 18:16:01 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAGA3vBt726ha1P91DrSOF8oc57X27tLO8o9Wi9oDN5L3qqRi7Q@mail.gmail.com>
References: <CAGA3vBt726ha1P91DrSOF8oc57X27tLO8o9Wi9oDN5L3qqRi7Q@mail.gmail.com>
On Tue, 2023-08-01 at 09:47 -0400, richard coleman wrote:
> In PostgreSQL foreign tables are not automatically analyzed and analyze must be
> specifically called on each table. In the case of sharing tables between
> PostgreSQL clusters, there is the use_remote_estimate foreign server option.
>
> In some of our multi terabyte databases, manually running analyze on all of
> the foreign tables can take more than a day. This is per database containing
> the foreign schema. Since we have certain large schema that we have centrally
> located and share to all of our other database clusters, this really adds up.
>
> use_remote_estimate isn't really a solution as it adds way too much overhead
> and processing time to every query run.
>
> Since these tables are being continuously analyzed in the database that hosts
> the data, is there some way that they statistics could be easily passed through
> the foreign server mechanism to the remote database that's calling the query?
>
> Unless I'm missing something we can either:
> 1. manually run analyze on each foreign table in each database that points to
> the host table
> 2. set use_remote_estimate = true which will cause PostgreSQL to re-obtain
> statistics on a per query basis.
>
> What I am hoping for is either:
> 1. pass through the results of analyze from the source database to the one
> where the foreign query is being run
> 2. add the ability to automatically run analyze on foreign tables just as they
> are currently run on local tables.
>
> Of the two, #1 would seem to be the easiest and least wasteful of resources.
Unfortunately, both your wishes don't look feasible:
- Transferring table statistics would mean that PostgreSQL understands statistics
from other server versions. This is complicated, and we have decided not to
do this for pg_upgrade, so I don't think we'll try to do it here.
- Autoanalyzing foreign tables would mean that we have some idea how much data
has changed on the remote server. How should we do that?
What I can imagine is that instead of reading the complete remote table during
ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part. That could be a
workable enhancement.
Yours,
Laurenz Albe
view thread (15+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: analyze foreign tables
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox