public inbox for [email protected]  
help / color / mirror / Atom feed
From: Wetmore, Matthew (CTR) <[email protected]>
To: richard coleman <[email protected]>
To: Laurenz Albe <[email protected]>
Cc: Pgsql-admin <[email protected]>
Subject: analyze foreign tables
Date: Tue, 1 Aug 2023 17:51:56 +0000
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAGA3vBspwtKvaFBXT4c78cdW+mgfXJ68MSDL3TiVdMxpZPp_VA@mail.gmail.com>
References: <CAGA3vBt726ha1P91DrSOF8oc57X27tLO8o9Wi9oDN5L3qqRi7Q@mail.gmail.com>
	<[email protected]>
	<CAGA3vBspwtKvaFBXT4c78cdW+mgfXJ68MSDL3TiVdMxpZPp_VA@mail.gmail.com>

I just do it like this per table.  Might not solve your exact issue, but another option. You can scale down the analyze factor to something very small like 0.00000001

-- Find current setting (this is at database level)
select * from pg_settings  where name in ('autovacuum','autovacuum_analyze_scale_factor','autovacuum_analyze_threshold','autovacuum_vacuum_scale_factor');
select current_setting('autovacuum_vacuum_scale_factor') as "analyze_scale_factor",current_setting('autovacuum_vacuum_threshold') as "vacuum_threshold";
select current_setting('autovacuum_analyze_scale_factor') as "analyze_scale_factor", current_setting('autovacuum_analyze_threshold') as "analyze_threshold";
-- Note: The smaller number = more aggressive = vacuum more frequence
-- Current:
-- autovacuum_analyze_scale_factor = 0.05     ---> 0.002
-- autovacuum_vacuum_scale_factor = 0.1       ---> 0.001
-- Fine Tune at table level = ALTER TABLE mytable SET (autovacuum_analyze_scale_factor = 0.02);
ALTER TABLE your_schema.your_table SET (autovacuum_enabled = true,autovacuum_analyze_scale_factor = 0.002,autovacuum_vacuum_scale_factor = 0.001);
-- Put it back to use global setting
ALTER TABLE your_schema.your_table RESET (autovacuum_enabled,autovacuum_analyze_scale_factor,autovacuum_vacuum_scale_factor);

From: richard coleman <[email protected]>
Sent: Tuesday, August 1, 2023 9:36 AM
To: Laurenz Albe <[email protected]>
Cc: Pgsql-admin <[email protected]>
Subject: [EXTERNAL] Re: analyze foreign tables

Laurenz,

Thanks for taking the time to respond.

Right now I'm stuck with cronning a script to manually run analyze on every foreign table in every database, which in our case is most of them.

Would it be possible to transfer table statistics between the same version of PostgreSQL, ex: source is pg15, target is pg15?

Otherwise, anything that can be done to speed this up would be very helpful.

Thanks again,
rik.


On Tue, Aug 1, 2023 at 12:16 PM Laurenz Albe <[email protected]<mailto:[email protected]>> wrote:
On Tue, 2023-08-01 at 09:47 -0400, richard coleman wrote:
> In PostgreSQL foreign tables are not automatically analyzed and analyze must be
> specifically called on each table.  In the case of sharing tables between
> PostgreSQL clusters, there is the use_remote_estimate foreign server option.
>
> In some of our multi terabyte databases, manually running analyze on all of
> the foreign tables can take more than a day.  This is per database containing
> the foreign schema.  Since we have certain large schema that we have centrally
> located and share to all of our other database clusters, this really adds up.
>
> use_remote_estimate isn't really a solution as it adds way too much overhead
> and processing time to every query run.
>
> Since these tables are being continuously analyzed in the database that hosts
> the data, is there some way that they statistics could be easily passed through
> the foreign server mechanism to the remote database that's calling the query?
>
> Unless I'm missing something we can either:
> 1. manually run analyze on each foreign table in each database that points to
>    the host table
> 2. set use_remote_estimate = true which will cause PostgreSQL to re-obtain
>    statistics on a per query basis.
>
> What I am hoping for is either:
> 1. pass through the results of analyze from the source database to the one
>    where the foreign query is being run
> 2. add the ability to automatically run analyze on foreign tables just as they
>    are currently run on local tables.
>
> Of the two, #1 would seem to be the easiest and least wasteful of resources.

Unfortunately, both your wishes don't look feasible:

- Transferring table statistics would mean that PostgreSQL understands statistics
  from other server versions.  This is complicated, and we have decided not to
  do this for pg_upgrade, so I don't think we'll try to do it here.

- Autoanalyzing foreign tables would mean that we have some idea how much data
  has changed on the remote server.  How should we do that?

What I can imagine is that instead of reading the complete remote table during
ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part.  That could be a
workable enhancement.

Yours,
Laurenz Albe


view thread (15+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: analyze foreign tables
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox