public inbox for [email protected]  
help / color / mirror / Atom feed
Re: analyze-in-stages post upgrade questions
19+ messages / 5 participants
[nested] [flat]

* Re: analyze-in-stages post upgrade questions
@ 2025-07-09 16:37  Mircea Cadariu <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Mircea Cadariu @ 2025-07-09 16:37 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected] <[email protected]>

Hi Laurenz,

On 09/07/2025 16:26, Laurenz Albe wrote:
> I have added the patch to the current commitfest:
> https://commitfest.postgresql.org/patch/5871/

Just to let you know that I have added a review through the commitfest app.

You can see it here: 
https://www.postgresql.org/message-id/flat/[email protected]...

Kind regards,

Mircea Cadariu


^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-07-09 17:22  Laurenz Albe <[email protected]>
  parent: Mircea Cadariu <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Laurenz Albe @ 2025-07-09 17:22 UTC (permalink / raw)
  To: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected] <[email protected]>

On Wed, 2025-07-09 at 17:37 +0100, Mircea Cadariu wrote:
> Just to let you know that I have added a review through the commitfest app. 

Thanks!

The patch is still in state "needs review".
If there is something that I should change, you should set it to
"waiting on author".  If you think that the patch is ready to go
as it is, please set it to "ready for committer".

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-07-10 16:20  Mircea Cadariu <[email protected]>
  parent: Laurenz Albe <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Mircea Cadariu @ 2025-07-10 16:20 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected] <[email protected]>

Hi Laurenz,


Got it. I have only one suggestion for the patch. Consider adding a 
corresponding test in src/bin/scripts/t/100_vacuumdb.pl.

Proposal (I used this to check the patch):

$node->safe_psql('postgres',
     "CREATE TABLE parent_table (a INT) PARTITION BY LIST (a);\n"
       . "CREATE TABLE child_table PARTITION OF parent_table FOR VALUES 
IN (1);\n"
       . "INSERT INTO parent_table VALUES (1);\n");
$node->issues_sql_like(
     [
         'vacuumdb', '--analyze-only', 'postgres'
     ],
     qr/statement:\s+ANALYZE\s+public\.parent_table/s,
     '--analyze_only updates statistics for partitioned tables');


Kind regards,

Mircea Cadariu







^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-07-11 09:51  Laurenz Albe <[email protected]>
  parent: Mircea Cadariu <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Laurenz Albe @ 2025-07-11 09:51 UTC (permalink / raw)
  To: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

[moving to pgsql-hackers]
On Thu, 2025-07-10 at 17:20 +0100, Mircea Cadariu wrote:
>       I have only one suggestion for the patch. Consider adding a 
> corresponding test in src/bin/scripts/t/100_vacuumdb.pl.
> 
> Proposal (I used this to check the patch):
> 
> $node->safe_psql('postgres',
>      "CREATE TABLE parent_table (a INT) PARTITION BY LIST (a);\n"
>        . "CREATE TABLE child_table PARTITION OF parent_table FOR VALUES 
> IN (1);\n"
>        . "INSERT INTO parent_table VALUES (1);\n");
> $node->issues_sql_like(
>      [
>          'vacuumdb', '--analyze-only', 'postgres'
>      ],
>      qr/statement:\s+ANALYZE\s+public\.parent_table/s,
>      '--analyze_only updates statistics for partitioned tables');

Good idea; done in the attached version 2 of the patch.

Yours,
Laurenz Albe


Attachments:

  [text/x-patch] v2-0001-Make-vacuumdb-Z-process-partitioned-tables.patch (4.0K, 2-v2-0001-Make-vacuumdb-Z-process-partitioned-tables.patch)
  download | inline diff:
From 4bf5048da6ffd250ee204fff419cf975aa7a6548 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <[email protected]>
Date: Fri, 11 Jul 2025 11:46:55 +0200
Subject: [PATCH v2] Make vacuumdb -Z process partitioned tables

Autoanalyze won't process partitioned tables, but a manual ANALYZE
of the whole database does.  So it was surprising that partitioned
tables were not processed by "vacuumdb --analyze-only".

In addition, "vacuumdb --analyze-only" is what you run to collect
missing statistics after a "pg_upgrade", and it makes a lot of sense
to collect statistics for partitioned tables at that point.

However, running VACUUM on partitioned tables adds no benefit over
vacuuming the partitions, and VACUUM is more expensive than ANALYZE,
so we won't treat partitioned tables unless --analyze-only is given.
Otherwise, we'd end up vacuuming the partitions twice, which would
be a waste of resources.

Author: Laurenz Albe <[email protected]>
Author: Mircea Cadariu <[email protected]>
Discussion: https://postgr.es/m/CO1PR04MB8281387B9AD9DE30976966BBC045A%40CO1PR04MB8281.namprd04.prod.outlook.com
---
 doc/src/sgml/ref/vacuumdb.sgml    |  4 ++++
 src/bin/scripts/t/100_vacuumdb.pl | 12 ++++++++++++
 src/bin/scripts/vacuumdb.c        | 23 +++++++++++++++++++----
 3 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index b0680a61814..6a4c8d51b7f 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -440,6 +440,10 @@ PostgreSQL documentation
       <listitem>
        <para>
         Only calculate statistics for use by the optimizer (no vacuum).
+        If that option is specified, <command>vacuumdb</command> will also
+        process partitioned tables.  Without that option, only the partitions
+        will be considered, unless a partitioned table is explicitly specified
+        with the <option>--table</option> option.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index ff56a13b46b..9ab42c39a94 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -340,4 +340,16 @@ $node->issues_sql_unlike(
 	qr/statement:\ ANALYZE/sx,
 	'--missing-stats-only with no missing partition stats');
 
+$node->safe_psql('postgres',
+	"CREATE TABLE parent_table (a INT) PARTITION BY LIST (a);\n"
+	  . "CREATE TABLE child_table PARTITION OF parent_table FOR VALUES
+IN (1);\n"
+	  . "INSERT INTO parent_table VALUES (1);\n");
+$node->issues_sql_like(
+	[
+		'vacuumdb', '--analyze-only', 'postgres'
+	],
+	qr/statement:\s+ANALYZE\s+public\.parent_table/s,
+	'--analyze_only updates statistics for partitioned tables');
+
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 79b1096eb08..d7ca8300f43 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -911,10 +911,25 @@ retrieve_objects(PGconn *conn, vacuumingOptions *vacopts,
 	 */
 	if ((objfilter & OBJFILTER_TABLE) == 0)
 	{
-		appendPQExpBufferStr(&catalog_query,
-							 " AND c.relkind OPERATOR(pg_catalog.=) ANY (array["
-							 CppAsString2(RELKIND_RELATION) ", "
-							 CppAsString2(RELKIND_MATVIEW) "])\n");
+		/*
+		 * VACUUMing partitioned tables would be unreasonably expensive, since
+		 * that entails processing the partitions twice (once as part of the
+		 * partitioned table, once as tables in their own right) for no
+		 * benefit. But if we only ANALYZE, collecting statistics for
+		 * partitioned tables is worth the effort.
+		 */
+		if (vacopts->analyze_only)
+			appendPQExpBufferStr(&catalog_query,
+								 " AND c.relkind OPERATOR(pg_catalog.=) ANY (array["
+								 CppAsString2(RELKIND_RELATION) ", "
+								 CppAsString2(RELKIND_MATVIEW) ", "
+								 CppAsString2(RELKIND_PARTITIONED_TABLE) "])\n");
+		else
+			appendPQExpBufferStr(&catalog_query,
+								 " AND c.relkind OPERATOR(pg_catalog.=) ANY (array["
+								 CppAsString2(RELKIND_RELATION) ", "
+								 CppAsString2(RELKIND_MATVIEW) "])\n");
+
 	}
 
 	/*
-- 
2.50.0



^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-07-11 10:42  Mircea Cadariu <[email protected]>
  parent: Laurenz Albe <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Mircea Cadariu @ 2025-07-11 10:42 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On 11/07/2025 10:51, Laurenz Albe wrote:

> Good idea; done in the attached version 2 of the patch.

Thanks! Looks good. I have set the status of the Commitfest entry to 
"Ready for Committer".


Kind regards,

Mircea Cadariu







^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-07-30 02:49  Fujii Masao <[email protected]>
  parent: Mircea Cadariu <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Fujii Masao @ 2025-07-30 02:49 UTC (permalink / raw)
  To: Mircea Cadariu <[email protected]>; +Cc: Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Fri, Jul 11, 2025 at 7:42 PM Mircea Cadariu <[email protected]> wrote:
>
> On 11/07/2025 10:51, Laurenz Albe wrote:
>
> > Good idea; done in the attached version 2 of the patch.
>
> Thanks! Looks good. I have set the status of the Commitfest entry to
> "Ready for Committer".

I've started reviewing the patch since it's marked as ready for committer.

Overall, I like the change. But I have one question: should this be treated as
a bug fix that we back-patch to supported branches, or is it more of
an improvement that should only go into master?


         Only calculate statistics for use by the optimizer (no vacuum).
+        If that option is specified, <command>vacuumdb</command> will also
+        process partitioned tables.  Without that option, only the partitions
+        will be considered, unless a partitioned table is explicitly specified
+        with the <option>--table</option> option.

This wording seems a bit out of place in the --analyze-only section,
since it also describes the default behavior of vacuumdb without that option.
Wouldn't it make more sense to move that explanation in the --table section?

For example, we could add something like:

------------------
If no tables are specified with the --table option, vacuumdb will
clean all regular tables and materialized views in the connected
database. If --analyze-only or --analyze-in-stages is also specified,
it will analyze all regular tables, partitioned tables, and
materialized views (but not foreign tables).
------------------


+ /*
+ * VACUUMing partitioned tables would be unreasonably expensive, since
+ * that entails processing the partitions twice (once as part of the
+ * partitioned table, once as tables in their own right) for no
+ * benefit. But if we only ANALYZE, collecting statistics for
+ * partitioned tables is worth the effort.
+ */

This is probably true. But isn't the main reason more about aligning with
the behavior of the underlying VACUUM and ANALYZE commands? As the vacuumdb
docs says, "There is no effective difference between vacuuming and analyzing
databases via this utility and via other methods for accessing the server.",
so its default target objects should match: VACUUM skips partitioned tables
by default, while ANALYZE includes them. If that's the case, maybe the comment
should reflect that instead.


+ qr/statement:\s+ANALYZE\s+public\.parent_table/s,
+ '--analyze_only updates statistics for partitioned tables');

A plain space might be sufficient instead of \s+.
Also, I don't think the backslash before ".parent_table" is necessary.

Regards,

-- 
Fujii Masao






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-06 04:01  Mircea Cadariu <[email protected]>
  parent: Fujii Masao <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Mircea Cadariu @ 2025-08-06 04:01 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; +Cc: Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

Hi,

On 30/07/2025 12:49, Fujii Masao wrote:
> I've started reviewing the patch since it's marked as ready for committer.
Thanks!
> Overall, I like the change. But I have one question: should this be treated as
> a bug fix that we back-patch to supported branches, or is it more of
> an improvement that should only go into master?
I reckon it might make sense to back-patch it to previous versions, as 
users might not upgrade always to the latest version.
>           Only calculate statistics for use by the optimizer (no vacuum).
> +        If that option is specified, <command>vacuumdb</command> will also
> +        process partitioned tables.  Without that option, only the partitions
> +        will be considered, unless a partitioned table is explicitly specified
> +        with the <option>--table</option> option.
>
> This wording seems a bit out of place in the --analyze-only section,
> since it also describes the default behavior of vacuumdb without that option.
> Wouldn't it make more sense to move that explanation in the --table section?
>
> For example, we could add something like:
>
> ------------------
> If no tables are specified with the --table option, vacuumdb will
> clean all regular tables and materialized views in the connected
> database. If --analyze-only or --analyze-in-stages is also specified,
> it will analyze all regular tables, partitioned tables, and
> materialized views (but not foreign tables).
> ------------------
Yes, agreed.
> + /*
> + * VACUUMing partitioned tables would be unreasonably expensive, since
> + * that entails processing the partitions twice (once as part of the
> + * partitioned table, once as tables in their own right) for no
> + * benefit. But if we only ANALYZE, collecting statistics for
> + * partitioned tables is worth the effort.
> + */
>
> This is probably true. But isn't the main reason more about aligning with
> the behavior of the underlying VACUUM and ANALYZE commands? As the vacuumdb
> docs says, "There is no effective difference between vacuuming and analyzing
> databases via this utility and via other methods for accessing the server.",
> so its default target objects should match: VACUUM skips partitioned tables
> by default, while ANALYZE includes them. If that's the case, maybe the comment
> should reflect that instead.
I see what you mean. From that perspective, I wonder if we even need a 
comment there at all.
> + qr/statement:\s+ANALYZE\s+public\.parent_table/s,
> + '--analyze_only updates statistics for partitioned tables');
>
> A plain space might be sufficient instead of \s+.
> Also, I don't think the backslash before ".parent_table" is necessary.

Good catch! Indeed let's simplify that to contain strictly only what's 
necessary.

Kind regards,

Mircea Cadariu


^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-06 14:25  Fujii Masao <[email protected]>
  parent: Mircea Cadariu <[email protected]>
  0 siblings, 2 replies; 19+ messages in thread

From: Fujii Masao @ 2025-08-06 14:25 UTC (permalink / raw)
  To: Mircea Cadariu <[email protected]>; +Cc: Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Wed, Aug 6, 2025 at 1:01 PM Mircea Cadariu <[email protected]> wrote:
> Overall, I like the change. But I have one question: should this be treated as
> a bug fix that we back-patch to supported branches, or is it more of
> an improvement that should only go into master?
>
> I reckon it might make sense to back-patch it to previous versions, as users might not upgrade always to the latest version.

I understand your point. But on second thought, since the patch changes
behavior, I'm leaning toward treating it as an improvement, so it should
only go to master...


> + /*
> + * VACUUMing partitioned tables would be unreasonably expensive, since
> + * that entails processing the partitions twice (once as part of the
> + * partitioned table, once as tables in their own right) for no
> + * benefit. But if we only ANALYZE, collecting statistics for
> + * partitioned tables is worth the effort.
> + */
>
> This is probably true. But isn't the main reason more about aligning with
> the behavior of the underlying VACUUM and ANALYZE commands? As the vacuumdb
> docs says, "There is no effective difference between vacuuming and analyzing
> databases via this utility and via other methods for accessing the server.",
> so its default target objects should match: VACUUM skips partitioned tables
> by default, while ANALYZE includes them. If that's the case, maybe the comment
> should reflect that instead.
>
> I see what you mean. From that perspective, I wonder if we even need a comment there at all.

Or, if we keep it, though, I'd like to update it to something like
the following:

--------------------
vacuumdb should generally follow the behavior of the underlying
VACUUM and ANALYZE commands. If analyze_only is true, process
regular tables, materialized views, and partitioned tables, just like
ANALYZE (with no specific target tables) does. Otherwise, process
only regular tables and materialized views, since VACUUM skips
partitioned tables when no target tables are specified.
--------------------

Regards,

-- 
Fujii Masao






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-06 18:14  Nathan Bossart <[email protected]>
  parent: Fujii Masao <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: Nathan Bossart @ 2025-08-06 18:14 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Wed, Aug 06, 2025 at 11:25:53PM +0900, Fujii Masao wrote:
> On Wed, Aug 6, 2025 at 1:01 PM Mircea Cadariu <[email protected]> wrote:
>> Overall, I like the change. But I have one question: should this be treated as
>> a bug fix that we back-patch to supported branches, or is it more of
>> an improvement that should only go into master?
>>
>> I reckon it might make sense to back-patch it to previous versions, as users might not upgrade always to the latest version.
> 
> I understand your point. But on second thought, since the patch changes
> behavior, I'm leaning toward treating it as an improvement, so it should
> only go to master...

I also am leaning towards treating this as v19 material.  It's a nontrivial
behavior change, and this option is useful for major version upgrades,
which is an area that we really don't want to surprise users too much.
Furthermore, auto-analyze doesn't process partitioned tables, either, so
this introduces a bit of divergence.  (I'd love to see that project picked
up again someday.  Perhaps I will take a gander...)

-- 
nathan






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-06 20:52  Laurenz Albe <[email protected]>
  parent: Fujii Masao <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: Laurenz Albe @ 2025-08-06 20:52 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; Mircea Cadariu <[email protected]>; +Cc: Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Wed, 2025-08-06 at 23:25 +0900, Fujii Masao wrote:
> On Wed, Aug 6, 2025 at 1:01 PM Mircea Cadariu <[email protected]> wrote:
> > Overall, I like the change. But I have one question: should this be treated as
> > a bug fix that we back-patch to supported branches, or is it more of
> > an improvement that should only go into master?
> > 
> > I reckon it might make sense to back-patch it to previous versions, as users might not upgrade always to the latest version.
> 
> I understand your point. But on second thought, since the patch changes
> behavior, I'm leaning toward treating it as an improvement, so it should
> only go to master...

I agree that this behavior change should not be backpatched.
That is not a bugfix.

> > + /*
> > + * VACUUMing partitioned tables would be unreasonably expensive, since
> > + * that entails processing the partitions twice (once as part of the
> > + * partitioned table, once as tables in their own right) for no
> > + * benefit. But if we only ANALYZE, collecting statistics for
> > + * partitioned tables is worth the effort.
> > + */
> > 
> > This is probably true. But isn't the main reason more about aligning with
> > the behavior of the underlying VACUUM and ANALYZE commands? As the vacuumdb
> > docs says, "There is no effective difference between vacuuming and analyzing
> > databases via this utility and via other methods for accessing the server.",
> > so its default target objects should match: VACUUM skips partitioned tables
> > by default, while ANALYZE includes them. If that's the case, maybe the comment
> > should reflect that instead.
> > 
> > I see what you mean. From that perspective, I wonder if we even need a comment there at all.
> 
> Or, if we keep it, though, I'd like to update it to something like
> the following:
> 
> --------------------
> vacuumdb should generally follow the behavior of the underlying
> VACUUM and ANALYZE commands. If analyze_only is true, process
> regular tables, materialized views, and partitioned tables, just like
> ANALYZE (with no specific target tables) does. Otherwise, process
> only regular tables and materialized views, since VACUUM skips
> partitioned tables when no target tables are specified.
> --------------------

I am fine with that suggestion.

Alternatively, my original comment could be amended with

  Besides, ANALYZE (without an option) processes partitioned tables, and
  "vacuumdb -Z" should behave like ANALYZE.

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-18 02:36  Fujii Masao <[email protected]>
  parent: Nathan Bossart <[email protected]>
  0 siblings, 0 replies; 19+ messages in thread

From: Fujii Masao @ 2025-08-18 02:36 UTC (permalink / raw)
  To: Nathan Bossart <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Laurenz Albe <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Thu, Aug 7, 2025 at 3:14 AM Nathan Bossart <[email protected]> wrote:
> I also am leaning towards treating this as v19 material.  It's a nontrivial
> behavior change, and this option is useful for major version upgrades,
> which is an area that we really don't want to surprise users too much.

+1

> Furthermore, auto-analyze doesn't process partitioned tables, either, so
> this introduces a bit of divergence.  (I'd love to see that project picked
> up again someday.  Perhaps I will take a gander...)

Sounds good!

Regards,

-- 
Fujii Masao






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-18 02:38  Fujii Masao <[email protected]>
  parent: Laurenz Albe <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Fujii Masao @ 2025-08-18 02:38 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Thu, Aug 7, 2025 at 5:52 AM Laurenz Albe <[email protected]> wrote:
> > I understand your point. But on second thought, since the patch changes
> > behavior, I'm leaning toward treating it as an improvement, so it should
> > only go to master...
>
> I agree that this behavior change should not be backpatched.
> That is not a bugfix.

+1

> > --------------------
> > vacuumdb should generally follow the behavior of the underlying
> > VACUUM and ANALYZE commands. If analyze_only is true, process
> > regular tables, materialized views, and partitioned tables, just like
> > ANALYZE (with no specific target tables) does. Otherwise, process
> > only regular tables and materialized views, since VACUUM skips
> > partitioned tables when no target tables are specified.
> > --------------------
>
> I am fine with that suggestion.

Thanks! So I've updated the patch based on my earlier comments.
Unless there are objections, I'll commit the attached version to master only.

Regards,

-- 
Fujii Masao


Attachments:

  [application/octet-stream] v3-0001-vacuumdb-Make-vacuumdb-analyze-only-process-parti.patch (4.8K, 2-v3-0001-vacuumdb-Make-vacuumdb-analyze-only-process-parti.patch)
  download | inline diff:
From 5c969379514aed6de612165f39fa709740a56933 Mon Sep 17 00:00:00 2001
From: Fujii Masao <[email protected]>
Date: Mon, 18 Aug 2025 10:41:15 +0900
Subject: [PATCH v3] vacuumdb: Make vacuumdb --analyze-only process partitioned
 tables.

vacuumdb should follow the behavior of the underlying VACUUM and ANALYZE
commands. When --analyze-only is used, it ought to analyze regular tables,
materialized views, and partitioned tables, just as ANALYZE (with no explicit
target tables) does. Otherwise, it should only process regular tables and
materialized views, since VACUUM skips partitioned tables when no targets
are given.

Previously, vacuumdb --analyze-only skipped partitioned tables. This was
inconsistent, and also inconvenient after pg_upgrade, where --analyze-only
is typically used to gather missing statistics.

This commit fixes the behavior so that vacuumdb --analyze-only also processes
partitioned tables. As a result, both vacuumdb --analyze-only and
ANALYZE (with no explicit targets) now analyze regular tables,
partitioned tables, and materialized views, but not foreign tables.

Because this is a nontrivial behavior change, it is applied only to master.

Reported-by: Zechman, Derek S <[email protected]>
Author: Laurenz Albe <[email protected]>
Co-authored-by: Mircea Cadariu <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/CO1PR04MB8281387B9AD9DE30976966BBC045A%40CO1PR04MB8281.namprd04.prod.outlook.com
---
 doc/src/sgml/ref/vacuumdb.sgml    |  9 +++++++++
 src/bin/scripts/t/100_vacuumdb.pl | 11 +++++++++++
 src/bin/scripts/vacuumdb.c        | 24 ++++++++++++++++++++----
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index c7d9dca17b8..53147480515 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -397,6 +397,15 @@ PostgreSQL documentation
         Multiple tables can be vacuumed by writing multiple
         <option>-t</option> switches.
        </para>
+       <para>
+        If no tables are specified with the <option>--table</option> option,
+        <application>vacuumdb</application> will clean all regular tables
+        and materialized views in the connected database.
+        If <option>--analyze-only</option> or
+        <option>--analyze-in-stages</option> is also specified,
+        it will analyze all regular tables, partitioned tables,
+        and materialized views (but not foreign tables).
+       </para>
        <tip>
         <para>
          If you specify columns, you probably have to escape the parentheses
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index ff56a13b46b..240f0fdd3e5 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -340,4 +340,15 @@ $node->issues_sql_unlike(
 	qr/statement:\ ANALYZE/sx,
 	'--missing-stats-only with no missing partition stats');
 
+$node->safe_psql('postgres',
+	"CREATE TABLE parent_table (a INT) PARTITION BY LIST (a);\n"
+	  . "CREATE TABLE child_table PARTITION OF parent_table FOR VALUES IN (1);\n"
+	  . "INSERT INTO parent_table VALUES (1);\n");
+$node->issues_sql_like(
+	[
+		'vacuumdb', '--analyze-only', 'postgres'
+	],
+	qr/statement: ANALYZE public.parent_table/s,
+	'--analyze-only updates statistics for partitioned tables');
+
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 79b1096eb08..22093e50aa5 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -911,10 +911,26 @@ retrieve_objects(PGconn *conn, vacuumingOptions *vacopts,
 	 */
 	if ((objfilter & OBJFILTER_TABLE) == 0)
 	{
-		appendPQExpBufferStr(&catalog_query,
-							 " AND c.relkind OPERATOR(pg_catalog.=) ANY (array["
-							 CppAsString2(RELKIND_RELATION) ", "
-							 CppAsString2(RELKIND_MATVIEW) "])\n");
+		/*
+		 * vacuumdb should generally follow the behavior of the underlying
+		 * VACUUM and ANALYZE commands. If analyze_only is true, process
+		 * regular tables, materialized views, and partitioned tables, just
+		 * like ANALYZE (with no specific target tables) does. Otherwise,
+		 * process only regular tables and materialized views, since VACUUM
+		 * skips partitioned tables when no target tables are specified.
+		 */
+		if (vacopts->analyze_only)
+			appendPQExpBufferStr(&catalog_query,
+								 " AND c.relkind OPERATOR(pg_catalog.=) ANY (array["
+								 CppAsString2(RELKIND_RELATION) ", "
+								 CppAsString2(RELKIND_MATVIEW) ", "
+								 CppAsString2(RELKIND_PARTITIONED_TABLE) "])\n");
+		else
+			appendPQExpBufferStr(&catalog_query,
+								 " AND c.relkind OPERATOR(pg_catalog.=) ANY (array["
+								 CppAsString2(RELKIND_RELATION) ", "
+								 CppAsString2(RELKIND_MATVIEW) "])\n");
+
 	}
 
 	/*
-- 
2.50.1



^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-18 06:40  Laurenz Albe <[email protected]>
  parent: Fujii Masao <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Laurenz Albe @ 2025-08-18 06:40 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Mon, 2025-08-18 at 11:38 +0900, Fujii Masao wrote:
> Thanks! So I've updated the patch based on my earlier comments.
> Unless there are objections, I'll commit the attached version to master only.

I am fine with your patch.

One suggestion:

> --- a/doc/src/sgml/ref/vacuumdb.sgml
> +++ b/doc/src/sgml/ref/vacuumdb.sgml
> @@ -397,6 +397,15 @@ PostgreSQL documentation
>          Multiple tables can be vacuumed by writing multiple
>          <option>-t</option> switches.
>         </para>
> +       <para>
> +        If no tables are specified with the <option>--table</option> option,
> +        <application>vacuumdb</application> will clean all regular tables
> +        and materialized views in the connected database.
> +        If <option>--analyze-only</option> or
> +        <option>--analyze-in-stages</option> is also specified,
> +        it will analyze all regular tables, partitioned tables,
> +        and materialized views (but not foreign tables).
> +       </para>

I suggest replacing "clean" with "process", since VACUUM does so much more than
clean up dead tuples.

Concerning backpatching, I voted against, but I suggest that this be backpatched
to v18.  I don't feel very strongly about it though.

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-19 14:40  Fujii Masao <[email protected]>
  parent: Laurenz Albe <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Fujii Masao @ 2025-08-19 14:40 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Mon, Aug 18, 2025 at 3:40 PM Laurenz Albe <[email protected]> wrote:
>
> On Mon, 2025-08-18 at 11:38 +0900, Fujii Masao wrote:
> > Thanks! So I've updated the patch based on my earlier comments.
> > Unless there are objections, I'll commit the attached version to master only.
>
> I am fine with your patch.

Thanks for the review!


> One suggestion:
>
> > --- a/doc/src/sgml/ref/vacuumdb.sgml
> > +++ b/doc/src/sgml/ref/vacuumdb.sgml
> > @@ -397,6 +397,15 @@ PostgreSQL documentation
> >          Multiple tables can be vacuumed by writing multiple
> >          <option>-t</option> switches.
> >         </para>
> > +       <para>
> > +        If no tables are specified with the <option>--table</option> option,
> > +        <application>vacuumdb</application> will clean all regular tables
> > +        and materialized views in the connected database.
> > +        If <option>--analyze-only</option> or
> > +        <option>--analyze-in-stages</option> is also specified,
> > +        it will analyze all regular tables, partitioned tables,
> > +        and materialized views (but not foreign tables).
> > +       </para>
>
> I suggest replacing "clean" with "process", since VACUUM does so much more than
> clean up dead tuples.

I see your point. However, since the vacuumdb docs already use "clean"
in several places, I think it's better to keep using "clean" here
for consistency. Thought?


> Concerning backpatching, I voted against, but I suggest that this be backpatched
> to v18.  I don't feel very strongly about it though.

As for back-patching, I failed to find a strong reason to apply this change
to v18 over the many other patches that could not be committed before
the feature freeze... Of course if there's broad support for back-patching,
we can certainly revisit it. But for now I'm thinking to commit the patch
to master.

Regards,

-- 
Fujii Masao






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-19 15:16  Laurenz Albe <[email protected]>
  parent: Fujii Masao <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Laurenz Albe @ 2025-08-19 15:16 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Tue, 2025-08-19 at 23:40 +0900, Fujii Masao wrote:
> > > --- a/doc/src/sgml/ref/vacuumdb.sgml
> > > +++ b/doc/src/sgml/ref/vacuumdb.sgml
> > > @@ -397,6 +397,15 @@ PostgreSQL documentation
> > >           Multiple tables can be vacuumed by writing multiple
> > >           <option>-t</option> switches.
> > >          </para>
> > > +       <para>
> > > +        If no tables are specified with the <option>--table</option> option,
> > > +        <application>vacuumdb</application> will clean all regular tables
> > > +        and materialized views in the connected database.
> > > +        If <option>--analyze-only</option> or
> > > +        <option>--analyze-in-stages</option> is also specified,
> > > +        it will analyze all regular tables, partitioned tables,
> > > +        and materialized views (but not foreign tables).
> > > +       </para>
> > 
> > I suggest replacing "clean" with "process", since VACUUM does so much more than
> > clean up dead tuples.
> 
> I see your point. However, since the vacuumdb docs already use "clean"
> in several places, I think it's better to keep using "clean" here
> for consistency. Thought?

Works for me; I didn't consider that.

> > Concerning backpatching, I voted against, but I suggest that this be backpatched
> > to v18.  I don't feel very strongly about it though.
> 
> As for back-patching, I failed to find a strong reason to apply this change
> to v18 over the many other patches that could not be committed before
> the feature freeze... Of course if there's broad support for back-patching,
> we can certainly revisit it. But for now I'm thinking to commit the patch
> to master.

I don't have a strong reason either - my reasoning was that the change is small
and unlikely to introduce a bug, and that it would be nice to get more accurate
statistics on partitioned tables after "pg_upgrade" a year earlier.

But I won't object if the patch is only in v19.

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-20 04:31  Fujii Masao <[email protected]>
  parent: Laurenz Albe <[email protected]>
  0 siblings, 2 replies; 19+ messages in thread

From: Fujii Masao @ 2025-08-20 04:31 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Wed, Aug 20, 2025 at 12:16 AM Laurenz Albe <[email protected]> wrote:
>
> On Tue, 2025-08-19 at 23:40 +0900, Fujii Masao wrote:
> > > > --- a/doc/src/sgml/ref/vacuumdb.sgml
> > > > +++ b/doc/src/sgml/ref/vacuumdb.sgml
> > > > @@ -397,6 +397,15 @@ PostgreSQL documentation
> > > >           Multiple tables can be vacuumed by writing multiple
> > > >           <option>-t</option> switches.
> > > >          </para>
> > > > +       <para>
> > > > +        If no tables are specified with the <option>--table</option> option,
> > > > +        <application>vacuumdb</application> will clean all regular tables
> > > > +        and materialized views in the connected database.
> > > > +        If <option>--analyze-only</option> or
> > > > +        <option>--analyze-in-stages</option> is also specified,
> > > > +        it will analyze all regular tables, partitioned tables,
> > > > +        and materialized views (but not foreign tables).
> > > > +       </para>
> > >
> > > I suggest replacing "clean" with "process", since VACUUM does so much more than
> > > clean up dead tuples.
> >
> > I see your point. However, since the vacuumdb docs already use "clean"
> > in several places, I think it's better to keep using "clean" here
> > for consistency. Thought?
>
> Works for me; I didn't consider that.
>
> > > Concerning backpatching, I voted against, but I suggest that this be backpatched
> > > to v18.  I don't feel very strongly about it though.
> >
> > As for back-patching, I failed to find a strong reason to apply this change
> > to v18 over the many other patches that could not be committed before
> > the feature freeze... Of course if there's broad support for back-patching,
> > we can certainly revisit it. But for now I'm thinking to commit the patch
> > to master.
>
> I don't have a strong reason either - my reasoning was that the change is small
> and unlikely to introduce a bug, and that it would be nice to get more accurate
> statistics on partitioned tables after "pg_upgrade" a year earlier.
>
> But I won't object if the patch is only in v19.

OK, so for now I've pushed the patch to master. Thanks!

Regards,

-- 
Fujii Masao






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-20 07:14  Laurenz Albe <[email protected]>
  parent: Fujii Masao <[email protected]>
  1 sibling, 0 replies; 19+ messages in thread

From: Laurenz Albe @ 2025-08-20 07:14 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; +Cc: Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

On Wed, 2025-08-20 at 13:31 +0900, Fujii Masao wrote:
> OK, so for now I've pushed the patch to master. Thanks!

Thank you for working on that!

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2025-08-21 01:51  Justin Pryzby <[email protected]>
  parent: Fujii Masao <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: Justin Pryzby @ 2025-08-21 01:51 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; +Cc: Laurenz Albe <[email protected]>; Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

When analyzing a partitioned table, I think you should use ANALYZE ONLY,
or otherwise avoid processing the children twice.

Thanks for handling this.  I was recently suprised to learn that
vacuumdb doesn't process parents.

-- 
Justin






^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: analyze-in-stages post upgrade questions
@ 2026-06-04 17:14  Justin Pryzby <[email protected]>
  parent: Justin Pryzby <[email protected]>
  0 siblings, 0 replies; 19+ messages in thread

From: Justin Pryzby @ 2026-06-04 17:14 UTC (permalink / raw)
  To: Fujii Masao <[email protected]>; +Cc: Laurenz Albe <[email protected]>; Mircea Cadariu <[email protected]>; Zechman, Derek S <[email protected]>; Adrian Klaver <[email protected]>; [email protected]

Checking back.

On Wed, Aug 20, 2025 at 08:51:16PM -0500, Justin Pryzby wrote:
> When analyzing a partitioned table, I think you should use ANALYZE ONLY,
> or otherwise avoid processing the children twice.
> 
> Thanks for handling this.  I was recently suprised to learn that
> vacuumdb doesn't process parents.






^ permalink  raw  reply  [nested|flat] 19+ messages in thread


end of thread, other threads:[~2026-06-04 17:14 UTC | newest]

Thread overview: 19+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-07-09 16:37 Re: analyze-in-stages post upgrade questions Mircea Cadariu <[email protected]>
2025-07-09 17:22 ` Laurenz Albe <[email protected]>
2025-07-10 16:20   ` Mircea Cadariu <[email protected]>
2025-07-11 09:51     ` Laurenz Albe <[email protected]>
2025-07-11 10:42       ` Mircea Cadariu <[email protected]>
2025-07-30 02:49         ` Fujii Masao <[email protected]>
2025-08-06 04:01           ` Mircea Cadariu <[email protected]>
2025-08-06 14:25             ` Fujii Masao <[email protected]>
2025-08-06 18:14               ` Nathan Bossart <[email protected]>
2025-08-18 02:36                 ` Fujii Masao <[email protected]>
2025-08-06 20:52               ` Laurenz Albe <[email protected]>
2025-08-18 02:38                 ` Fujii Masao <[email protected]>
2025-08-18 06:40                   ` Laurenz Albe <[email protected]>
2025-08-19 14:40                     ` Fujii Masao <[email protected]>
2025-08-19 15:16                       ` Laurenz Albe <[email protected]>
2025-08-20 04:31                         ` Fujii Masao <[email protected]>
2025-08-20 07:14                           ` Laurenz Albe <[email protected]>
2025-08-21 01:51                           ` Justin Pryzby <[email protected]>
2026-06-04 17:14                             ` Justin Pryzby <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox