public inbox for [email protected]help / color / mirror / Atom feed
Re: analyze-in-stages post upgrade questions 3+ messages / 3 participants [nested] [flat]
* Re: analyze-in-stages post upgrade questions @ 2025-06-27 15:31 Adrian Klaver <[email protected]> 0 siblings, 1 reply; 3+ messages in thread From: Adrian Klaver @ 2025-06-27 15:31 UTC (permalink / raw) To: Zechman, Derek S <[email protected]>; [email protected] <[email protected]> On 6/27/25 06:35, Zechman, Derek S wrote: > We recently performed an upgrade from pg14 (14.18) to pg16 (16.9) and > performed the analyze-in-stages post upgrade. It has been noticed that > some plans changed to use hash joins instead of nested loops. Further > investigation found it was because the parent table of partitioned > tables did not have stats. After running an ANALYZE on the parent > tables we got similar plan an execution times as before. > > I have two questions > > 1 - Why does analyze-in-stages not analyze the parent tables? > > 2 – What happens if we do not run analyze-in-stages post upgrade and > just run an analyze? It is spelled out in the docs: https://www.postgresql.org/docs/current/pgupgrade.html Emphasis added "Using vacuumdb --all --analyze-only can efficiently generate such statistics, and the use of --jobs can speed it up. Option --analyze-in-stages can be used to generate **minimal statistics** quickly. If vacuum_cost_delay is set to a non-zero value, this can be overridden to speed up statistics generation using PGOPTIONS, e.g., PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb ...." and from here: https://www.postgresql.org/docs/current/app-vacuumdb.html "--analyze-in-stages Only calculate statistics for use by the optimizer (no vacuum), like --analyze-only. Run three stages of analyze; the first stage uses the lowest possible statistics target (see default_statistics_target) to produce usable statistics faster, and subsequent stages build the full statistics. This option is only useful to analyze a database that currently has no statistics or has wholly incorrect ones, such as if it is newly populated from a restored dump or by pg_upgrade. Be aware that running with this option in a database with existing statistics may cause the query optimizer choices to become transiently worse due to the low statistics targets of the early stages. " > > Thanks, > > Sean > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: analyze-in-stages post upgrade questions @ 2025-06-27 21:39 Laurenz Albe <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 1 reply; 3+ messages in thread From: Laurenz Albe @ 2025-06-27 21:39 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; Zechman, Derek S <[email protected]>; [email protected] <[email protected]> On Fri, 2025-06-27 at 08:31 -0700, Adrian Klaver wrote: > On 6/27/25 06:35, Zechman, Derek S wrote: > > We recently performed an upgrade from pg14 (14.18) to pg16 (16.9) and > > performed the analyze-in-stages post upgrade. It has been noticed that > > some plans changed to use hash joins instead of nested loops. Further > > investigation found it was because the parent table of partitioned > > tables did not have stats. After running an ANALYZE on the parent > > tables we got similar plan an execution times as before. > > > > I have two questions > > > > 1 - Why does analyze-in-stages not analyze the parent tables? > > > > 2 – What happens if we do not run analyze-in-stages post upgrade and > > just run an analyze? > > It is spelled out in the docs: > > https://www.postgresql.org/docs/current/pgupgrade.html > > Emphasis added > > "Using vacuumdb --all --analyze-only can efficiently generate such > statistics, and the use of --jobs can speed it up. Option > --analyze-in-stages can be used to generate **minimal statistics** > quickly. If vacuum_cost_delay is set to a non-zero value, this can be > overridden to speed up statistics generation using PGOPTIONS, e.g., > PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb ...." > > and from here: > > https://www.postgresql.org/docs/current/app-vacuumdb.html > > "--analyze-in-stages > > Only calculate statistics for use by the optimizer (no vacuum), > like --analyze-only. Run three stages of analyze; the first stage uses > the lowest possible statistics target (see default_statistics_target) to > produce usable statistics faster, and subsequent stages build the full > statistics. > > This option is only useful to analyze a database that currently has > no statistics or has wholly incorrect ones, such as if it is newly > populated from a restored dump or by pg_upgrade. Be aware that running > with this option in a database with existing statistics may cause the > query optimizer choices to become transiently worse due to the low > statistics targets of the early stages. Well, that wouldn't explain why it doesn't work on partitioned tables. I am under the impression that it should. Derek, can cou share the pg_stats entries for the partitioned table? Yours, Laurenz Albe ^ permalink raw reply [nested|flat] 3+ messages in thread
* RE: analyze-in-stages post upgrade questions @ 2025-06-28 01:23 Zechman, Derek S <[email protected]> parent: Laurenz Albe <[email protected]> 0 siblings, 0 replies; 3+ messages in thread From: Zechman, Derek S @ 2025-06-28 01:23 UTC (permalink / raw) To: Laurenz Albe <[email protected]>; Adrian Klaver <[email protected]>; [email protected] <[email protected]> > > We recently performed an upgrade from pg14 (14.18) to pg16 (16.9) and > > performed the analyze-in-stages post upgrade. It has been noticed that > > some plans changed to use hash joins instead of nested loops. Further > > investigation found it was because the parent table of partitioned > > tables did not have stats. After running an ANALYZE on the parent > > tables we got similar plan an execution times as before. > > > > I have two questions > > > > 1 - Why does analyze-in-stages not analyze the parent tables? > > > > 2 – What happens if we do not run analyze-in-stages post upgrade and > > just run an analyze? > > It is spelled out in the docs: > > https://urldefense.com/v3/__https://www.postgresql.org/docs/current/pgupgrade.html__;!!Lf_9VycLqA!jU...; > > Emphasis added > > "Using vacuumdb --all --analyze-only can efficiently generate such > statistics, and the use of --jobs can speed it up. Option > --analyze-in-stages can be used to generate **minimal statistics** > quickly. If vacuum_cost_delay is set to a non-zero value, this can be > overridden to speed up statistics generation using PGOPTIONS, e.g., > PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb ...." > > and from here: > > https://urldefense.com/v3/__https://www.postgresql.org/docs/current/app-vacuumdb.html__;!!Lf_9VycLqA...; > > "--analyze-in-stages > > Only calculate statistics for use by the optimizer (no vacuum), > like --analyze-only. Run three stages of analyze; the first stage uses > the lowest possible statistics target (see default_statistics_target) to > produce usable statistics faster, and subsequent stages build the full > statistics. > > This option is only useful to analyze a database that currently has > no statistics or has wholly incorrect ones, such as if it is newly > populated from a restored dump or by pg_upgrade. Be aware that running > with this option in a database with existing statistics may cause the > query optimizer choices to become transiently worse due to the low > statistics targets of the early stages. Well, that wouldn't explain why it doesn't work on partitioned tables. I am under the impression that it should. Derek, can cou share the pg_stats entries for the partitioned table? Yours, Laurenz Albe There are no entries in pg_stats for the parent table until after I manually run an analyze on it – Example below => select relname, reltuples, relkind from pg_class where relname ~ '^chapter_[0-9]+$' or relname='chapter' order by 1; relname | reltuples | relkind -------------+-----------+--------- chapter | -1 | p chapter_1 | 4 | r chapter_10 | 4 | r chapter_100 | 30 | r chapter_101 | 15 | r chapter_102 | 15 | r … => select count(*) from pg_stats where tablename='chapter'; count ------- 0 (1 row) => analyze chapter; ANALYZE => select relname, reltuples, relkind from pg_class where relkind ='p' and relname='chapter'; relname | reltuples | relkind ---------+-----------+--------- chapter | 7589 | p (1 row) => select count(*) from pg_stats where tablename='chapter'; count ------- 49 (1 row) toy_epc_stg_1_db=> ^ permalink raw reply [nested|flat] 3+ messages in thread
end of thread, other threads:[~2025-06-28 01:23 UTC | newest] Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2025-06-27 15:31 Re: analyze-in-stages post upgrade questions Adrian Klaver <[email protected]> 2025-06-27 21:39 ` Laurenz Albe <[email protected]> 2025-06-28 01:23 ` Zechman, Derek S <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox