Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qQuEx-001M7i-JX for pgsql-admin@arkaria.postgresql.org; Tue, 01 Aug 2023 18:37:04 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1qQuEv-009GvL-MX for pgsql-admin@arkaria.postgresql.org; Tue, 01 Aug 2023 18:37:03 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qQuEv-009GvC-0w for pgsql-admin@lists.postgresql.org; Tue, 01 Aug 2023 18:37:02 +0000 Received: from mail-oo1-xc31.google.com ([2607:f8b0:4864:20::c31]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qQuEs-000EKD-W0 for pgsql-admin@lists.postgresql.org; Tue, 01 Aug 2023 18:37:01 +0000 Received: by mail-oo1-xc31.google.com with SMTP id 006d021491bc7-56cc461f34fso1601645eaf.0 for ; Tue, 01 Aug 2023 11:36:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1690915018; x=1691519818; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=WYkeoN9BXiD5WgYRW0MdlqXaXB5JIZP5+kBJRnBcVhY=; b=FJX3NICYKd2/cCr2Lhici46Ia13TfDZWNxlftVnVIP+KZQR9aI3eizRwN2XVUui80+ nbQhGVf6aAYFir6kYYjA4oBqDzE7/c5MB/z0gw7FjZTZpVuaym7x9SB+7F/E9VDKSjUA djHADyz8YkGQmucg3Uh/UIDZkef5VHpfGuUerx4MnBVjbi0zf1H+LPAulR+8+AR6qCmu zJlNOm9FTcwCB9UcWDBnYKXpGIYi3g2kztgxma4Ogk99jHLIcflArd+DS2JeI1MTymV8 +Wnh0nHDNcQv0As3b4RSkr4CNGekg+0mzI+JRCoT0q9BMkENLUbqwLYFTlLi3kVLabXH KOfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690915018; x=1691519818; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=WYkeoN9BXiD5WgYRW0MdlqXaXB5JIZP5+kBJRnBcVhY=; b=M/T+6q6nap//SJYRZG7tFLDCjyGMHtfKGsRke6MhnzEs511V/NJmlihiZVYOl4qqgk xP/tpG1xuZ4oFt8lY5U4/bPkYgwtuezN5bozjRofHl6dCAdyQVgxskqLB72lfvMy49aO stWYAmUdLvHGhdw8Q+S+gYdwYl7gjURrv0i+6VAY2XPkIixaecfXZM0UIg3Qho9s5oyF Q2LcJCLUq4olhbTgcyFaTJmBheEZpXabtjFykNZOwTjmCHdgUdyGt+Yzs8Pr4tynJFUx phY2hWEPdannZTlnOzXFICo2IODk9wuYOPFETGsuU/VDL7jETZU/hTd4G45wCTH4SB2p fy5w== X-Gm-Message-State: ABy/qLa3rzUp9denAcvQ0nkzCvPd+GQRCQk1akLfH39vga3ZyBgc1ItA CQMdPPDdJysenX+uitPW3IoV3HkXvgwYHOFkmiU= X-Google-Smtp-Source: APBJJlE6/2wP2RIUivZhiiidQhgroTQb/B/+Z0lBzcisN4VrG387wudhdIcDT8mVkbTCYfFMczN/kM15wPIbqBExDNA= X-Received: by 2002:a4a:6f19:0:b0:566:f763:8fb7 with SMTP id h25-20020a4a6f19000000b00566f7638fb7mr10604424ooc.2.1690915018237; Tue, 01 Aug 2023 11:36:58 -0700 (PDT) MIME-Version: 1.0 References: <08735195c2ef42beb3cda68bee761752@express-scripts.com> In-Reply-To: <08735195c2ef42beb3cda68bee761752@express-scripts.com> From: richard coleman Date: Tue, 1 Aug 2023 14:36:47 -0400 Message-ID: Subject: Re: analyze foreign tables To: "Wetmore, Matthew (CTR)" Cc: Laurenz Albe , Pgsql-admin Content-Type: multipart/alternative; boundary="00000000000001f8e00601e0d559" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000001f8e00601e0d559 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Mathew, Thanks but I think you might have misunderstood my concern. I am talking about foreign tables, autovacuum explicitly doesn't work on foreign tables. rik. On Tue, Aug 1, 2023 at 1:51=E2=80=AFPM Wetmore, Matthew (CTR) < Matthew.Wetmore@express-scripts.com> wrote: > I just do it like this per table. Might not solve your exact issue, but > another option. You can scale down the analyze factor to something very > small like 0.00000001 > > > > -- Find current setting (this is at database level) > > select * from pg_settings where name in > ('autovacuum','autovacuum_analyze_scale_factor','autovacuum_analyze_thres= hold','autovacuum_vacuum_scale_factor'); > > select current_setting('autovacuum_vacuum_scale_factor') as > "analyze_scale_factor",current_setting('autovacuum_vacuum_threshold') as > "vacuum_threshold"; > > select current_setting('autovacuum_analyze_scale_factor') as > "analyze_scale_factor", current_setting('autovacuum_analyze_threshold') a= s > "analyze_threshold"; > > -- Note: The smaller number =3D more aggressive =3D vacuum more frequence > > -- Current: > > -- autovacuum_analyze_scale_factor =3D 0.05 ---> 0.002 > > -- autovacuum_vacuum_scale_factor =3D 0.1 ---> 0.001 > > -- Fine Tune at table level =3D ALTER TABLE mytable SET > (autovacuum_analyze_scale_factor =3D 0.02); > > ALTER TABLE your_schema.your_table SET (autovacuum_enabled =3D > true,autovacuum_analyze_scale_factor =3D 0.002,autovacuum_vacuum_scale_fa= ctor > =3D 0.001); > > -- Put it back to use global setting > > ALTER TABLE your_schema.your_table RESET > (autovacuum_enabled,autovacuum_analyze_scale_factor,autovacuum_vacuum_sca= le_factor); > > > > *From:* richard coleman > *Sent:* Tuesday, August 1, 2023 9:36 AM > *To:* Laurenz Albe > *Cc:* Pgsql-admin > *Subject:* [EXTERNAL] Re: analyze foreign tables > > > > Laurenz, > > > > Thanks for taking the time to respond. > > > > Right now I'm stuck with cronning a script to manually run analyze on > every foreign table in every database, which in our case is most of them. > > > > Would it be possible to transfer table statistics between the same versio= n > of PostgreSQL, ex: source is pg15, target is pg15? > > > > Otherwise, anything that can be done to speed this up would be very > helpful. > > > > Thanks again, > > rik. > > > > > > On Tue, Aug 1, 2023 at 12:16=E2=80=AFPM Laurenz Albe > wrote: > > On Tue, 2023-08-01 at 09:47 -0400, richard coleman wrote: > > In PostgreSQL foreign tables are not automatically analyzed and analyze > must be > > specifically called on each table. In the case of sharing tables betwe= en > > PostgreSQL clusters, there is the use_remote_estimate foreign server > option. > > > > In some of our multi terabyte databases, manually running analyze on al= l > of > > the foreign tables can take more than a day. This is per database > containing > > the foreign schema. Since we have certain large schema that we have > centrally > > located and share to all of our other database clusters, this really > adds up. > > > > use_remote_estimate isn't really a solution as it adds way too much > overhead > > and processing time to every query run. > > > > Since these tables are being continuously analyzed in the database that > hosts > > the data, is there some way that they statistics could be easily passed > through > > the foreign server mechanism to the remote database that's calling the > query? > > > > Unless I'm missing something we can either: > > 1. manually run analyze on each foreign table in each database that > points to > > the host table > > 2. set use_remote_estimate =3D true which will cause PostgreSQL to > re-obtain > > statistics on a per query basis. > > > > What I am hoping for is either: > > 1. pass through the results of analyze from the source database to the > one > > where the foreign query is being run > > 2. add the ability to automatically run analyze on foreign tables just > as they > > are currently run on local tables. > > > > Of the two, #1 would seem to be the easiest and least wasteful of > resources. > > Unfortunately, both your wishes don't look feasible: > > - Transferring table statistics would mean that PostgreSQL understands > statistics > from other server versions. This is complicated, and we have decided > not to > do this for pg_upgrade, so I don't think we'll try to do it here. > > - Autoanalyzing foreign tables would mean that we have some idea how much > data > has changed on the remote server. How should we do that? > > What I can imagine is that instead of reading the complete remote table > during > ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part. That could b= e > a > workable enhancement. > > Yours, > Laurenz Albe > > --00000000000001f8e00601e0d559 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Mathew,=C2=A0

Thanks but I think you mi= ght have misunderstood my concern.=C2=A0 I am talking about foreign tables,= autovacuum explicitly doesn't work on foreign tables.

rik.

On Tue, Aug 1, 2023 at 1:51=E2=80=AFPM Wetmore, Matthew = (CTR) <Matthew.We= tmore@express-scripts.com> wrote:

I just do it like this per table.=C2=A0 Migh= t not solve your exact issue, but another option. You can scale down the an= alyze factor to something very small like 0.00000001

=C2=A0

-- Find current setting (this is at database= level)

select * from pg_settings=C2=A0 where name i= n ('autovacuum','autovacuum_analyze_scale_factor','auto= vacuum_analyze_threshold','autovacuum_vacuum_scale_factor');=

select current_setting('autovacuum_vacuu= m_scale_factor') as "analyze_scale_factor",current_setting(&#= 39;autovacuum_vacuum_threshold') as "vacuum_threshold";

select current_setting('autovacuum_analy= ze_scale_factor') as "analyze_scale_factor", current_setting(= 'autovacuum_analyze_threshold') as "analyze_threshold";

-- Note: The smaller number =3D more aggress= ive =3D vacuum more frequence

-- Current:

-- autovacuum_analyze_scale_factor =3D 0.05= =C2=A0=C2=A0=C2=A0=C2=A0 ---> 0.002

-- autovacuum_vacuum_scale_factor =3D 0.1=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ---> 0.001

-- Fine Tune at table level =3D ALTER TABLE = mytable SET (autovacuum_analyze_scale_factor =3D 0.02);

ALTER TABLE your_schema.your_table SET (auto= vacuum_enabled =3D true,autovacuum_analyze_scale_factor =3D 0.002,autovacuu= m_vacuum_scale_factor =3D 0.001);

-- Put it back to use global setting<= u>

ALTER TABLE your_schema.your_table RESET (au= tovacuum_enabled,autovacuum_analyze_scale_factor,autovacuum_vacuum_scale_fa= ctor);

=C2=A0

From: richard coleman <rcoleman.ascentgl@gmail.com>
Sent: Tuesday, August 1, 2023 9:36 AM
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: Pgsql-admin <pgsql-admin@lists.postgresql.org>
Subject: [EXTERNAL] Re: analyze foreign tables
<= /p>

=C2=A0

Laurenz,=C2=A0

=C2=A0

Thanks for taking the time to respond.=

=C2=A0

Right now I'm stuck with cronning a script to ma= nually run analyze on every foreign table in every database, which in our c= ase is most of them.

=C2=A0

Would it be possible to transfer table statistics be= tween the same version of PostgreSQL, ex: source is pg15, target is pg15?

=C2=A0

Otherwise, anything that can be done to speed this u= p would be very helpful.

=C2=A0

Thanks again,=C2=A0

rik.

=C2=A0

=C2=A0

On Tue, Aug 1, 2023 at 12:16=E2=80=AFPM Laurenz Albe= <laurenz.= albe@cybertec.at> wrote:

On Tue, 2023-08-01 at 09:47 -0400, richard coleman w= rote:
> In PostgreSQL foreign tables are not automatically analyzed and analyz= e must be
> specifically called on each table.=C2=A0 In the case of sharing tables= between
> PostgreSQL clusters, there is the=C2=A0use_remote_estimate foreign ser= ver option.
>
> In some of our multi terabyte databases, manually running analyze on a= ll of
> the foreign tables can take more than a day.=C2=A0 This is per databas= e containing
> the foreign schema.=C2=A0 Since we have certain large schema that we h= ave centrally
> located and share to all of our other database clusters, this really a= dds up.
>
> use_remote_estimate isn't really a solution as it adds way too muc= h overhead
> and processing time to every query run.
>
> Since these tables are being continuously analyzed in the database tha= t hosts
> the data, is there some way that they statistics could be easily passe= d through
> the foreign server mechanism to the remote database that's calling= the query?
>
> Unless I'm missing something we can either:
> 1. manually run analyze on each foreign table in each database that po= ints to
>=C2=A0 =C2=A0 the host table
> 2. set use_remote_estimate =3D true which will cause PostgreSQL to re-= obtain
>=C2=A0 =C2=A0 statistics on a per query basis.
>
> What I am hoping for is either:
> 1. pass through the results of analyze from the source database to the= one
>=C2=A0 =C2=A0 where the foreign query is being run
> 2. add the ability to automatically run analyze on foreign tables just= as they
>=C2=A0 =C2=A0 are currently run on local tables.
>
> Of the two, #1 would seem to be the easiest and least wasteful of reso= urces.

Unfortunately, both your wishes don't look feasible:

- Transferring table statistics would mean that PostgreSQL understands stat= istics
=C2=A0 from other server versions.=C2=A0 This is complicated, and we have d= ecided not to
=C2=A0 do this for pg_upgrade, so I don't think we'll try to do it = here.

- Autoanalyzing foreign tables would mean that we have some idea how much d= ata
=C2=A0 has changed on the remote server.=C2=A0 How should we do that?

What I can imagine is that instead of reading the complete remote table dur= ing
ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part.=C2=A0 That coul= d be a
workable enhancement.

Yours,
Laurenz Albe

--00000000000001f8e00601e0d559--