public inbox for [email protected]  
help / color / mirror / Atom feed
From: Laurenz Albe <[email protected]>
To: richard coleman <[email protected]>
To: Pgsql-admin <[email protected]>
Subject: Re: analyze foreign tables
Date: Tue, 01 Aug 2023 18:16:01 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAGA3vBt726ha1P91DrSOF8oc57X27tLO8o9Wi9oDN5L3qqRi7Q@mail.gmail.com>
References: <CAGA3vBt726ha1P91DrSOF8oc57X27tLO8o9Wi9oDN5L3qqRi7Q@mail.gmail.com>

On Tue, 2023-08-01 at 09:47 -0400, richard coleman wrote:
> In PostgreSQL foreign tables are not automatically analyzed and analyze must be
> specifically called on each table.  In the case of sharing tables between
> PostgreSQL clusters, there is the use_remote_estimate foreign server option.
> 
> In some of our multi terabyte databases, manually running analyze on all of
> the foreign tables can take more than a day.  This is per database containing
> the foreign schema.  Since we have certain large schema that we have centrally
> located and share to all of our other database clusters, this really adds up.
> 
> use_remote_estimate isn't really a solution as it adds way too much overhead
> and processing time to every query run.
> 
> Since these tables are being continuously analyzed in the database that hosts
> the data, is there some way that they statistics could be easily passed through
> the foreign server mechanism to the remote database that's calling the query?
> 
> Unless I'm missing something we can either:
> 1. manually run analyze on each foreign table in each database that points to
>    the host table
> 2. set use_remote_estimate = true which will cause PostgreSQL to re-obtain
>    statistics on a per query basis.
> 
> What I am hoping for is either:
> 1. pass through the results of analyze from the source database to the one
>    where the foreign query is being run
> 2. add the ability to automatically run analyze on foreign tables just as they
>    are currently run on local tables.
> 
> Of the two, #1 would seem to be the easiest and least wasteful of resources.

Unfortunately, both your wishes don't look feasible:

- Transferring table statistics would mean that PostgreSQL understands statistics
  from other server versions.  This is complicated, and we have decided not to
  do this for pg_upgrade, so I don't think we'll try to do it here.

- Autoanalyzing foreign tables would mean that we have some idea how much data
  has changed on the remote server.  How should we do that?

What I can imagine is that instead of reading the complete remote table during
ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part.  That could be a
workable enhancement.

Yours,
Laurenz Albe





view thread (15+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: analyze foreign tables
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox