public inbox for [email protected]
help / color / mirror / Atom feedInfinite Autovacuum loop caused by failing virtual generated column expression
7+ messages / 3 participants
[nested] [flat]
* Infinite Autovacuum loop caused by failing virtual generated column expression
@ 2026-04-10 20:18 SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 1 reply; 7+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-04-10 20:18 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>; [email protected]
Hi Hackers,
PG19 added support for stats on virtual generated columns [1]. Creating
extended statistics on a virtual generated column whose expression can
raise an error leads to ANALYZE failing repeatedly, and autovacuum retrying
indefinitely. This floods the server logs and also wastes resources. Vacuum
analyze on that column (without extended stats) succeeds.
In order to avoid retry storms, I think we have two options. (1)
skipping the offending row from the sample, (2) skipping the extended stats
computation for that table with a warning message. At least this avoid
autovacuum infinite retry. Attached a draft patch for the option (2).
Thoughts?
Repro:
CREATE TABLE t (
id int PRIMARY KEY,
a int,
gen int GENERATED ALWAYS AS (100 / a) VIRTUAL
);
INSERT INTO t VALUES (1, 10), (2, 5), (3, 0);
-- This succeeds (per-column stats don't evaluate the expression for
every row)
ANALYZE t;
-- Add extended statistics referencing the virtual gen col
CREATE STATISTICS t_stat ON a, gen FROM t;
-- This fails
ANALYZE t;
-- ERROR: division by zero
-- this succeeds
ANALYZE t(gen)
[1]:
https://www.postgresql.org/message-id/flat/20250422181006.dd6f9d1d81299f5b2ad55e1a%40sraoss.co.jp
Thanks,
Satya
Attachments:
[application/octet-stream] v1-0001-fix-analyze-extended-stats-virtual-gen-col.patch (5.1K, 3-v1-0001-fix-analyze-extended-stats-virtual-gen-col.patch)
download | inline diff:
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 2b83355d..ba25dd62 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -33,6 +33,7 @@
#include "pgstat.h"
#include "postmaster/autovacuum.h"
#include "rewrite/rewriteHandler.h"
+#include "utils/resowner.h"
#include "statistics/extended_stats_internal.h"
#include "statistics/statistics.h"
#include "utils/acl.h"
@@ -194,42 +195,112 @@ BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
if (stattarget == 0)
continue;
- /* evaluate expressions (if the statistics object has any) */
- data = make_build_data(onerel, stat, numrows, rows, stats, stattarget);
-
- /* compute statistic of each requested type */
- foreach(lc2, stat->types)
+ /*
+ * Wrap expression evaluation and stats computation in PG_TRY so
+ * that errors from evaluating virtual generated column expressions
+ * (e.g. division by zero) don't cause ANALYZE to fail entirely.
+ * Skip the statistics object and issue a WARNING instead.
+ *
+ * Used child ResourceOwner so that any resources allocated
+ * during expression evaluation are properly released on
+ * error without leaking.
+ */
{
- char t = (char) lfirst_int(lc2);
-
- if (t == STATS_EXT_NDISTINCT)
- ndistinct = statext_ndistinct_build(totalrows, data);
- else if (t == STATS_EXT_DEPENDENCIES)
- dependencies = statext_dependencies_build(data);
- else if (t == STATS_EXT_MCV)
- mcv = statext_mcv_build(data, totalrows, stattarget);
- else if (t == STATS_EXT_EXPRESSIONS)
+ ResourceOwner oldowner = CurrentResourceOwner;
+ ResourceOwner child;
+
+ child = ResourceOwnerCreate(oldowner, "BuildExtStatistics");
+ CurrentResourceOwner = child;
+
+ PG_TRY();
{
- AnlExprData *exprdata;
- int nexprs;
+ /* evaluate expressions (if the statistics has any) */
+ data = make_build_data(onerel, stat, numrows, rows,
+ stats, stattarget);
- /* should not happen, thanks to checks when defining stats */
- if (!stat->exprs)
- elog(ERROR, "requested expression stats, but there are no expressions");
+ /* compute statistic of each requested type */
+ foreach(lc2, stat->types)
+ {
+ char t = (char) lfirst_int(lc2);
+
+ if (t == STATS_EXT_NDISTINCT)
+ ndistinct = statext_ndistinct_build(totalrows, data);
+ else if (t == STATS_EXT_DEPENDENCIES)
+ dependencies = statext_dependencies_build(data);
+ else if (t == STATS_EXT_MCV)
+ mcv = statext_mcv_build(data, totalrows, stattarget);
+ else if (t == STATS_EXT_EXPRESSIONS)
+ {
+ AnlExprData *exprdata;
+ int nexprs;
+
+ /* should not happen, thanks to checks */
+ if (!stat->exprs)
+ elog(ERROR, "requested expression stats, but there are no expressions");
+
+ exprdata = build_expr_data(stat->exprs, stattarget);
+ nexprs = list_length(stat->exprs);
+
+ compute_expr_stats(onerel, exprdata, nexprs,
+ rows, numrows);
+
+ exprstats = serialize_expr_stats(exprdata, nexprs);
+ }
+ }
- exprdata = build_expr_data(stat->exprs, stattarget);
- nexprs = list_length(stat->exprs);
+ /* store the statistics in the catalog */
+ statext_store(stat->statOid, inh,
+ ndistinct, dependencies, mcv, exprstats, stats);
+
+ /* Success: release child ResourceOwner normally */
+ CurrentResourceOwner = oldowner;
+ ResourceOwnerRelease(child,
+ RESOURCE_RELEASE_BEFORE_LOCKS,
+ false, false);
+ ResourceOwnerRelease(child,
+ RESOURCE_RELEASE_LOCKS,
+ false, false);
+ ResourceOwnerRelease(child,
+ RESOURCE_RELEASE_AFTER_LOCKS,
+ false, false);
+ ResourceOwnerDelete(child);
+ }
+ PG_CATCH();
+ {
+ ErrorData *edata;
+
+ /* Release leaked resources from the child ResourceOwner */
+ CurrentResourceOwner = child;
+ ResourceOwnerRelease(child,
+ RESOURCE_RELEASE_BEFORE_LOCKS,
+ false, false);
+ ResourceOwnerRelease(child,
+ RESOURCE_RELEASE_LOCKS,
+ false, false);
+ ResourceOwnerRelease(child,
+ RESOURCE_RELEASE_AFTER_LOCKS,
+ false, false);
+ CurrentResourceOwner = oldowner;
+ ResourceOwnerDelete(child);
+
+ /* Save the error, issue a WARNING and continue */
+ MemoryContextSwitchTo(cxt);
+ edata = CopyErrorData();
+ FlushErrorState();
- compute_expr_stats(onerel, exprdata, nexprs, rows, numrows);
+ ereport(WARNING,
+ (errcode(ERRCODE_WARNING),
+ errmsg("skipping statistics object \"%s.%s\" for relation \"%s.%s\"",
+ stat->schema, stat->name,
+ get_namespace_name(onerel->rd_rel->relnamespace),
+ RelationGetRelationName(onerel)),
+ errdetail("%s", edata->message)));
- exprstats = serialize_expr_stats(exprdata, nexprs);
+ FreeErrorData(edata);
}
+ PG_END_TRY();
}
- /* store the statistics in the catalog */
- statext_store(stat->statOid, inh,
- ndistinct, dependencies, mcv, exprstats, stats);
-
/* for reporting progress */
pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
++ext_cnt);
[application/octet-stream] v1-0001-test-analyze-extended-stats-virtual-gen-col.patch (1.9K, 4-v1-0001-test-analyze-extended-stats-virtual-gen-col.patch)
download | inline diff:
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 37070c1a..7234d6b6 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -3207,6 +3207,14 @@ SELECT * FROM check_estimated_rows('SELECT * FROM virtual_gen_stats WHERE w = 0'
(1 row)
DROP TABLE virtual_gen_stats;
+-- extended statistics on virtual generated columns whose expressions can error
+CREATE TABLE virtual_gen_err (a int, b int GENERATED ALWAYS AS (a / 0) VIRTUAL);
+INSERT INTO virtual_gen_err VALUES (1), (2), (3);
+CREATE STATISTICS virtual_gen_err_s ON a, b FROM virtual_gen_err;
+ANALYZE virtual_gen_err; -- should warn, not fail
+WARNING: skipping statistics object "public.virtual_gen_err_s" for relation "public.virtual_gen_err"
+DETAIL: division by zero
+DROP TABLE virtual_gen_err;
-- Permission tests. Users should not be able to see specific data values in
-- the extended statistics, if they lack permission to see those values in
-- the underlying table.
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 3cc6012b..10918ffb 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -1608,6 +1608,13 @@ SELECT * FROM check_estimated_rows('SELECT * FROM virtual_gen_stats WHERE w = 0'
DROP TABLE virtual_gen_stats;
+-- extended statistics on virtual generated columns whose expressions can error
+CREATE TABLE virtual_gen_err (a int, b int GENERATED ALWAYS AS (a / 0) VIRTUAL);
+INSERT INTO virtual_gen_err VALUES (1), (2), (3);
+CREATE STATISTICS virtual_gen_err_s ON a, b FROM virtual_gen_err;
+ANALYZE virtual_gen_err; -- should warn, not fail
+DROP TABLE virtual_gen_err;
+
-- Permission tests. Users should not be able to see specific data values in
-- the extended statistics, if they lack permission to see those values in
-- the underlying table.
^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: Infinite Autovacuum loop caused by failing virtual generated column expression
@ 2026-04-11 16:33 Dean Rasheed <[email protected]>
parent: SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 1 reply; 7+ messages in thread
From: Dean Rasheed @ 2026-04-11 16:33 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; [email protected]
On Fri, 10 Apr 2026 at 21:19, SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> PG19 added support for stats on virtual generated columns [1]. Creating extended statistics on a virtual generated column whose expression can raise an error leads to ANALYZE failing repeatedly, and autovacuum retrying indefinitely. This floods the server logs and also wastes resources. Vacuum analyze on that column (without extended stats) succeeds.
>
True, though this is nothing new. The same thing can happen with
expression statistics on an expression that raises an error, which has
been possible since PG14.
> In order to avoid retry storms, I think we have two options. (1) skipping the offending row from the sample, (2) skipping the extended stats computation for that table with a warning message. At least this avoid autovacuum infinite retry. Attached a draft patch for the option (2). Thoughts?
>
I'm not sure. The default retry interval is 1 minute, so it won't
exactly be a flood of messages. Also, if the error only occurs for a
small subset of rows, it's possible that retrying might succeed.
Regards,
Dean
^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: Infinite Autovacuum loop caused by failing virtual generated column expression
@ 2026-04-14 06:24 Yugo Nagata <[email protected]>
parent: Dean Rasheed <[email protected]>
0 siblings, 1 reply; 7+ messages in thread
From: Yugo Nagata @ 2026-04-14 06:24 UTC (permalink / raw)
To: Dean Rasheed <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sat, 11 Apr 2026 17:33:13 +0100
Dean Rasheed <[email protected]> wrote:
> On Fri, 10 Apr 2026 at 21:19, SATYANARAYANA NARLAPURAM
> <[email protected]> wrote:
> >
> > PG19 added support for stats on virtual generated columns [1]. Creating extended statistics on a virtual generated column whose expression can raise an error leads to ANALYZE failing repeatedly, and autovacuum retrying indefinitely. This floods the server logs and also wastes resources. Vacuum analyze on that column (without extended stats) succeeds.
> >
>
> True, though this is nothing new. The same thing can happen with
> expression statistics on an expression that raises an error, which has
> been possible since PG14.
Yes, this issue is not new, and I’m not aware of a way to prevent it a priori.
>
> > In order to avoid retry storms, I think we have two options. (1) skipping the offending row from the sample, (2) skipping the extended stats computation for that table with a warning message. At least this avoid autovacuum infinite retry. Attached a draft patch for the option (2). Thoughts?
> >
>
> I'm not sure. The default retry interval is 1 minute, so it won't
> exactly be a flood of messages. Also, if the error only occurs for a
> small subset of rows, it's possible that retrying might succeed.
I think it would be good to skip ANALYZE for the extended statistics that cause
errors and just emit a warning, rather than aborting ANALYZE for the entire table.
It seems reasonable to treat this as the user’s responsibility to notice the warning
and address the underlying issue.
Regards,
Yugo Nagata
--
Yugo Nagata <[email protected]>
^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: Infinite Autovacuum loop caused by failing virtual generated column expression
@ 2026-04-14 07:16 SATYANARAYANA NARLAPURAM <[email protected]>
parent: Yugo Nagata <[email protected]>
0 siblings, 1 reply; 7+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-04-14 07:16 UTC (permalink / raw)
To: Yugo Nagata <[email protected]>; +Cc: Dean Rasheed <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi
On Mon, Apr 13, 2026 at 11:24 PM Yugo Nagata <[email protected]> wrote:
> On Sat, 11 Apr 2026 17:33:13 +0100
> Dean Rasheed <[email protected]> wrote:
>
> > On Fri, 10 Apr 2026 at 21:19, SATYANARAYANA NARLAPURAM
> > <[email protected]> wrote:
> > >
> > > PG19 added support for stats on virtual generated columns [1].
> Creating extended statistics on a virtual generated column whose expression
> can raise an error leads to ANALYZE failing repeatedly, and autovacuum
> retrying indefinitely. This floods the server logs and also wastes
> resources. Vacuum analyze on that column (without extended stats) succeeds.
> > >
> >
> > True, though this is nothing new. The same thing can happen with
> > expression statistics on an expression that raises an error, which has
> > been possible since PG14.
>
> Yes, this issue is not new, and I’m not aware of a way to prevent it a
> priori.
>
> >
> > > In order to avoid retry storms, I think we have two options. (1)
> skipping the offending row from the sample, (2) skipping the extended stats
> computation for that table with a warning message. At least this avoid
> autovacuum infinite retry. Attached a draft patch for the option (2).
> Thoughts?
> > >
> >
> > I'm not sure. The default retry interval is 1 minute, so it won't
> > exactly be a flood of messages. Also, if the error only occurs for a
> > small subset of rows, it's possible that retrying might succeed.
>
> I think it would be good to skip ANALYZE for the extended statistics that
> cause
> errors and just emit a warning, rather than aborting ANALYZE for the
> entire table.
> It seems reasonable to treat this as the user’s responsibility to notice
> the warning
> and address the underlying issue.
>
Yugo, thanks for the comments. Could you please review the v1 patch when you
get a chance. It is in the direction you suggested.
Thanks,
Satya
^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: Infinite Autovacuum loop caused by failing virtual generated column expression
@ 2026-04-28 09:14 Yugo Nagata <[email protected]>
parent: SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 1 reply; 7+ messages in thread
From: Yugo Nagata @ 2026-04-28 09:14 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Dean Rasheed <[email protected]>; PostgreSQL Hackers <[email protected]>
On Tue, 14 Apr 2026 00:16:42 -0700
SATYANARAYANA NARLAPURAM <[email protected]> wrote:
> Hi
>
> On Mon, Apr 13, 2026 at 11:24 PM Yugo Nagata <[email protected]> wrote:
>
> > On Sat, 11 Apr 2026 17:33:13 +0100
> > Dean Rasheed <[email protected]> wrote:
> >
> > > On Fri, 10 Apr 2026 at 21:19, SATYANARAYANA NARLAPURAM
> > > <[email protected]> wrote:
> > > >
> > > > PG19 added support for stats on virtual generated columns [1].
> > Creating extended statistics on a virtual generated column whose expression
> > can raise an error leads to ANALYZE failing repeatedly, and autovacuum
> > retrying indefinitely. This floods the server logs and also wastes
> > resources. Vacuum analyze on that column (without extended stats) succeeds.
> > > >
> > >
> > > True, though this is nothing new. The same thing can happen with
> > > expression statistics on an expression that raises an error, which has
> > > been possible since PG14.
> >
> > Yes, this issue is not new, and I’m not aware of a way to prevent it a
> > priori.
> >
> > >
> > > > In order to avoid retry storms, I think we have two options. (1)
> > skipping the offending row from the sample, (2) skipping the extended stats
> > computation for that table with a warning message. At least this avoid
> > autovacuum infinite retry. Attached a draft patch for the option (2).
> > Thoughts?
> > > >
> > >
> > > I'm not sure. The default retry interval is 1 minute, so it won't
> > > exactly be a flood of messages. Also, if the error only occurs for a
> > > small subset of rows, it's possible that retrying might succeed.
> >
> > I think it would be good to skip ANALYZE for the extended statistics that
> > cause
> > errors and just emit a warning, rather than aborting ANALYZE for the
> > entire table.
> > It seems reasonable to treat this as the user’s responsibility to notice
> > the warning
> > and address the underlying issue.
> >
>
> Yugo, thanks for the comments. Could you please review the v1 patch when you
> get a chance. It is in the direction you suggested.
I've looked into the patch and have some comments.
The child ResourceOwner is created and released in BuildRelationExtStatistics(),
but I don't think it is necessary if we add other PG_TRY block in make_build_data()
and compute_expr_stats(). For example in make_build_data():
+ PG_TRY();
+ {
+ datum = ExecEvalExpr(exprstate,
+ GetPerTupleExprContext(estate),
+ &isnull);
+ }
+ PG_CATCH();
+ {
+ ExecDropSingleTupleTableSlot(slot);
+ FreeExecutorState(estate);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
Also, we could add tests for extended statistics that do not involve virtual generated
columns, since those are not the cause root of the issue. In addition, it might be useful
to verify that non-skipped extended statistics are still computed successfully.
For example:
+CREATE TABLE expr_err (a int);
+INSERT INTO expr_err VALUES (1), (2), (3);
+CREATE STATISTICS expr_err_s1 ON ((a/0)) FROM expr_err;
+CREATE STATISTICS expr_err_s2 ON (a/0),(a+1) FROM expr_err;
+CREATE STATISTICS expr_err_s3 ON ((a+1)) FROM expr_err;
+ANALYZE expr_err; -- should warn, not fail
+SELECT statistics_name from pg_stats_ext x
+ WHERE tablename = 'expr_err' ORDER BY ROW(x.*);
Regards,
Yugo Nagata
--
Yugo Nagata <[email protected]>
^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: Infinite Autovacuum loop caused by failing virtual generated column expression
@ 2026-05-03 18:04 SATYANARAYANA NARLAPURAM <[email protected]>
parent: Yugo Nagata <[email protected]>
0 siblings, 1 reply; 7+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-05-03 18:04 UTC (permalink / raw)
To: Yugo Nagata <[email protected]>; +Cc: Dean Rasheed <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
On Tue, Apr 28, 2026 at 2:14 AM Yugo Nagata <[email protected]> wrote:
> On Tue, 14 Apr 2026 00:16:42 -0700
> SATYANARAYANA NARLAPURAM <[email protected]> wrote:
>
> > Hi
> >
> > On Mon, Apr 13, 2026 at 11:24 PM Yugo Nagata <[email protected]>
> wrote:
> >
> > > On Sat, 11 Apr 2026 17:33:13 +0100
> > > Dean Rasheed <[email protected]> wrote:
> > >
> > > > On Fri, 10 Apr 2026 at 21:19, SATYANARAYANA NARLAPURAM
> > > > <[email protected]> wrote:
> > > > >
> > > > > PG19 added support for stats on virtual generated columns [1].
> > > Creating extended statistics on a virtual generated column whose
> expression
> > > can raise an error leads to ANALYZE failing repeatedly, and autovacuum
> > > retrying indefinitely. This floods the server logs and also wastes
> > > resources. Vacuum analyze on that column (without extended stats)
> succeeds.
> > > > >
> > > >
> > > > True, though this is nothing new. The same thing can happen with
> > > > expression statistics on an expression that raises an error, which
> has
> > > > been possible since PG14.
> > >
> > > Yes, this issue is not new, and I’m not aware of a way to prevent it a
> > > priori.
> > >
> > > >
> > > > > In order to avoid retry storms, I think we have two options. (1)
> > > skipping the offending row from the sample, (2) skipping the extended
> stats
> > > computation for that table with a warning message. At least this avoid
> > > autovacuum infinite retry. Attached a draft patch for the option (2).
> > > Thoughts?
> > > > >
> > > >
> > > > I'm not sure. The default retry interval is 1 minute, so it won't
> > > > exactly be a flood of messages. Also, if the error only occurs for a
> > > > small subset of rows, it's possible that retrying might succeed.
> > >
> > > I think it would be good to skip ANALYZE for the extended statistics
> that
> > > cause
> > > errors and just emit a warning, rather than aborting ANALYZE for the
> > > entire table.
> > > It seems reasonable to treat this as the user’s responsibility to
> notice
> > > the warning
> > > and address the underlying issue.
> > >
> >
> > Yugo, thanks for the comments. Could you please review the v1 patch when
> you
> > get a chance. It is in the direction you suggested.
>
> I've looked into the patch and have some comments.
>
> The child ResourceOwner is created and released in
> BuildRelationExtStatistics(),
> but I don't think it is necessary if we add other PG_TRY block in
> make_build_data()
> and compute_expr_stats(). For example in make_build_data():
>
> + PG_TRY();
> + {
> + datum = ExecEvalExpr(exprstate,
> +
> GetPerTupleExprContext(estate),
> +
> &isnull);
> + }
> + PG_CATCH();
> + {
> + ExecDropSingleTupleTableSlot(slot);
> + FreeExecutorState(estate);
> + PG_RE_THROW();
> + }
> + PG_END_TRY();
>
Thanks, for reviewing the patch. Agreed, please find the updated patch.
> Also, we could add tests for extended statistics that do not involve
> virtual generated
> columns, since those are not the cause root of the issue. In addition, it
> might be useful
> to verify that non-skipped extended statistics are still computed
> successfully.
> For example:
>
> +CREATE TABLE expr_err (a int);
> +INSERT INTO expr_err VALUES (1), (2), (3);
> +CREATE STATISTICS expr_err_s1 ON ((a/0)) FROM expr_err;
> +CREATE STATISTICS expr_err_s2 ON (a/0),(a+1) FROM expr_err;
> +CREATE STATISTICS expr_err_s3 ON ((a+1)) FROM expr_err;
> +ANALYZE expr_err; -- should warn, not fail
> +SELECT statistics_name from pg_stats_ext x
> + WHERE tablename = 'expr_err' ORDER BY ROW(x.*);
>
Added these tests as well in the v2 patch.
Thanks,
Satya
Attachments:
[application/octet-stream] v2-0001-Fix-ANALYZE-crash-on-extended-stats-with-virtual-gen.patch (13.5K, 3-v2-0001-Fix-ANALYZE-crash-on-extended-stats-with-virtual-gen.patch)
download | inline diff:
From 6a2705d2084abd9b6b304c460cf17a4c96c1fdad Mon Sep 17 00:00:00 2001
From: Satya Narlapuram <[email protected]>
Date: Sun, 3 May 2026 17:37:24 +0000
Subject: [PATCH] Fix ANALYZE crash on extended stats with virtual generated
columns
When a table has extended statistics involving expressions on virtual
generated columns, ANALYZE can fail if evaluating those expressions
raises an error (e.g. division by zero). This happens because the
generated column expression is evaluated directly during statistics
computation.
Fix by wrapping expression evaluation in PG_TRY/PG_CATCH blocks:
- In make_build_data() and compute_expr_stats(), add PG_TRY blocks
that clean up executor resources (slot, estate) before re-throwing,
preventing resource leaks on error.
- In BuildRelationExtStatistics(), catch errors from expression
evaluation, issue a WARNING with the statistics object name and
error details, then continue processing the remaining statistics
objects.
This ensures ANALYZE completes successfully even when some extended
statistics objects cannot be computed due to expression evaluation
errors.
---
src/backend/statistics/extended_stats.c | 238 +++++++++++++++---------
src/test/regress/expected/stats_ext.out | 29 +++
src/test/regress/sql/stats_ext.sql | 20 ++
3 files changed, 195 insertions(+), 92 deletions(-)
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 2b83355d..88cea81b 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -194,41 +194,73 @@ BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
if (stattarget == 0)
continue;
- /* evaluate expressions (if the statistics object has any) */
- data = make_build_data(onerel, stat, numrows, rows, stats, stattarget);
-
- /* compute statistic of each requested type */
- foreach(lc2, stat->types)
+ /*
+ * Wrap expression evaluation and stats computation in PG_TRY so
+ * that errors from evaluating expressions (e.g. division by zero
+ * in virtual generated columns) don't cause ANALYZE to fail
+ * entirely. Skip the statistics object and issue a WARNING
+ * instead.
+ */
+ PG_TRY();
{
- char t = (char) lfirst_int(lc2);
-
- if (t == STATS_EXT_NDISTINCT)
- ndistinct = statext_ndistinct_build(totalrows, data);
- else if (t == STATS_EXT_DEPENDENCIES)
- dependencies = statext_dependencies_build(data);
- else if (t == STATS_EXT_MCV)
- mcv = statext_mcv_build(data, totalrows, stattarget);
- else if (t == STATS_EXT_EXPRESSIONS)
+ /* evaluate expressions (if the statistics has any) */
+ data = make_build_data(onerel, stat, numrows, rows,
+ stats, stattarget);
+
+ /* compute statistic of each requested type */
+ foreach(lc2, stat->types)
{
- AnlExprData *exprdata;
- int nexprs;
+ char t = (char) lfirst_int(lc2);
+
+ if (t == STATS_EXT_NDISTINCT)
+ ndistinct = statext_ndistinct_build(totalrows, data);
+ else if (t == STATS_EXT_DEPENDENCIES)
+ dependencies = statext_dependencies_build(data);
+ else if (t == STATS_EXT_MCV)
+ mcv = statext_mcv_build(data, totalrows, stattarget);
+ else if (t == STATS_EXT_EXPRESSIONS)
+ {
+ AnlExprData *exprdata;
+ int nexprs;
- /* should not happen, thanks to checks when defining stats */
- if (!stat->exprs)
- elog(ERROR, "requested expression stats, but there are no expressions");
+ /* should not happen, thanks to checks when defining stats */
+ if (!stat->exprs)
+ elog(ERROR, "requested expression stats, but there are no expressions");
- exprdata = build_expr_data(stat->exprs, stattarget);
- nexprs = list_length(stat->exprs);
+ exprdata = build_expr_data(stat->exprs, stattarget);
+ nexprs = list_length(stat->exprs);
- compute_expr_stats(onerel, exprdata, nexprs, rows, numrows);
+ compute_expr_stats(onerel, exprdata, nexprs,
+ rows, numrows);
- exprstats = serialize_expr_stats(exprdata, nexprs);
+ exprstats = serialize_expr_stats(exprdata, nexprs);
+ }
}
- }
- /* store the statistics in the catalog */
- statext_store(stat->statOid, inh,
- ndistinct, dependencies, mcv, exprstats, stats);
+ /* store the statistics in the catalog */
+ statext_store(stat->statOid, inh,
+ ndistinct, dependencies, mcv, exprstats, stats);
+ }
+ PG_CATCH();
+ {
+ ErrorData *edata;
+
+ /* Save the error, issue a WARNING and continue */
+ MemoryContextSwitchTo(cxt);
+ edata = CopyErrorData();
+ FlushErrorState();
+
+ ereport(WARNING,
+ (errcode(ERRCODE_WARNING),
+ errmsg("skipping statistics object \"%s.%s\" for relation \"%s.%s\"",
+ stat->schema, stat->name,
+ get_namespace_name(onerel->rd_rel->relnamespace),
+ RelationGetRelationName(onerel)),
+ errdetail("%s", edata->message)));
+
+ FreeErrorData(edata);
+ }
+ PG_END_TRY();
/* for reporting progress */
pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -2185,46 +2217,58 @@ compute_expr_stats(Relation onerel, AnlExprData *exprdata, int nexprs,
exprnulls = (bool *) palloc(numrows * sizeof(bool));
tcnt = 0;
- for (i = 0; i < numrows; i++)
+ PG_TRY();
{
- Datum datum;
- bool isnull;
+ for (i = 0; i < numrows; i++)
+ {
+ Datum datum;
+ bool isnull;
- /*
- * Reset the per-tuple context each time, to reclaim any cruft
- * left behind by evaluating the statistics expressions.
- */
- ResetExprContext(econtext);
+ /*
+ * Reset the per-tuple context each time, to reclaim any
+ * cruft left behind by evaluating the statistics
+ * expressions.
+ */
+ ResetExprContext(econtext);
- /* Set up for expression evaluation */
- ExecStoreHeapTuple(rows[i], slot, false);
+ /* Set up for expression evaluation */
+ ExecStoreHeapTuple(rows[i], slot, false);
- /*
- * Evaluate the expression. We do this in the per-tuple context so
- * as not to leak memory, and then copy the result into the
- * context created at the beginning of this function.
- */
- datum = ExecEvalExprSwitchContext(exprstate,
- GetPerTupleExprContext(estate),
- &isnull);
- if (isnull)
- {
- exprvals[tcnt] = (Datum) 0;
- exprnulls[tcnt] = true;
- }
- else
- {
- /* Make sure we copy the data into the context. */
- Assert(CurrentMemoryContext == expr_context);
+ /*
+ * Evaluate the expression. We do this in the per-tuple
+ * context so as not to leak memory, and then copy the
+ * result into the context created at the beginning of
+ * this function.
+ */
+ datum = ExecEvalExprSwitchContext(exprstate,
+ GetPerTupleExprContext(estate),
+ &isnull);
+ if (isnull)
+ {
+ exprvals[tcnt] = (Datum) 0;
+ exprnulls[tcnt] = true;
+ }
+ else
+ {
+ /* Make sure we copy the data into the context. */
+ Assert(CurrentMemoryContext == expr_context);
- exprvals[tcnt] = datumCopy(datum,
- stats->attrtype->typbyval,
- stats->attrtype->typlen);
- exprnulls[tcnt] = false;
- }
+ exprvals[tcnt] = datumCopy(datum,
+ stats->attrtype->typbyval,
+ stats->attrtype->typlen);
+ exprnulls[tcnt] = false;
+ }
- tcnt++;
+ tcnt++;
+ }
+ }
+ PG_CATCH();
+ {
+ ExecDropSingleTupleTableSlot(slot);
+ FreeExecutorState(estate);
+ PG_RE_THROW();
}
+ PG_END_TRY();
/*
* Now we can compute the statistics for the expression columns.
@@ -2624,46 +2668,56 @@ make_build_data(Relation rel, StatExtEntry *stat, int numrows, HeapTuple *rows,
/* Set up expression evaluation state */
exprstates = ExecPrepareExprList(stat->exprs, estate);
- for (i = 0; i < numrows; i++)
+ PG_TRY();
{
- /*
- * Reset the per-tuple context each time, to reclaim any cruft left
- * behind by evaluating the statistics object expressions.
- */
- ResetExprContext(econtext);
-
- /* Set up for expression evaluation */
- ExecStoreHeapTuple(rows[i], slot, false);
-
- idx = bms_num_members(stat->columns);
- foreach(lc, exprstates)
+ for (i = 0; i < numrows; i++)
{
- Datum datum;
- bool isnull;
- ExprState *exprstate = (ExprState *) lfirst(lc);
-
/*
- * XXX This probably leaks memory. Maybe we should use
- * ExecEvalExprSwitchContext but then we need to copy the result
- * somewhere else.
+ * Reset the per-tuple context each time, to reclaim any cruft
+ * left behind by evaluating the statistics object expressions.
*/
- datum = ExecEvalExpr(exprstate,
- GetPerTupleExprContext(estate),
- &isnull);
- if (isnull)
- {
- result->values[idx][i] = (Datum) 0;
- result->nulls[idx][i] = true;
- }
- else
+ ResetExprContext(econtext);
+
+ /* Set up for expression evaluation */
+ ExecStoreHeapTuple(rows[i], slot, false);
+
+ idx = bms_num_members(stat->columns);
+ foreach(lc, exprstates)
{
- result->values[idx][i] = datum;
- result->nulls[idx][i] = false;
- }
+ Datum datum;
+ bool isnull;
+ ExprState *exprstate = (ExprState *) lfirst(lc);
- idx++;
+ /*
+ * XXX This probably leaks memory. Maybe we should use
+ * ExecEvalExprSwitchContext but then we need to copy the
+ * result somewhere else.
+ */
+ datum = ExecEvalExpr(exprstate,
+ GetPerTupleExprContext(estate),
+ &isnull);
+ if (isnull)
+ {
+ result->values[idx][i] = (Datum) 0;
+ result->nulls[idx][i] = true;
+ }
+ else
+ {
+ result->values[idx][i] = datum;
+ result->nulls[idx][i] = false;
+ }
+
+ idx++;
+ }
}
}
+ PG_CATCH();
+ {
+ ExecDropSingleTupleTableSlot(slot);
+ FreeExecutorState(estate);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
ExecDropSingleTupleTableSlot(slot);
FreeExecutorState(estate);
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 37070c1a..43892f4f 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -3207,6 +3207,14 @@ SELECT * FROM check_estimated_rows('SELECT * FROM virtual_gen_stats WHERE w = 0'
(1 row)
DROP TABLE virtual_gen_stats;
+-- extended statistics on virtual generated columns whose expressions can error
+CREATE TABLE virtual_gen_err (a int, b int GENERATED ALWAYS AS (a / 0) VIRTUAL);
+INSERT INTO virtual_gen_err VALUES (1), (2), (3);
+CREATE STATISTICS virtual_gen_err_s ON a, b FROM virtual_gen_err;
+ANALYZE virtual_gen_err; -- should warn, not fail
+WARNING: skipping statistics object "public.virtual_gen_err_s" for relation "public.virtual_gen_err"
+DETAIL: division by zero
+DROP TABLE virtual_gen_err;
-- Permission tests. Users should not be able to see specific data values in
-- the extended statistics, if they lack permission to see those values in
-- the underlying table.
@@ -3709,3 +3717,24 @@ SELECT range_length_histogram, range_empty_frac, range_bounds_histogram
(1 row)
DROP TABLE stats_ext_tbl_range;
+-- Expression statistics with errors (not involving virtual generated columns).
+-- Errors during expression evaluation should produce a WARNING, not a failure,
+-- and non-erroring statistics objects should still be computed.
+CREATE TABLE expr_err (a int);
+INSERT INTO expr_err VALUES (1), (2), (3);
+CREATE STATISTICS expr_err_s1 ON ((a/0)) FROM expr_err;
+CREATE STATISTICS expr_err_s2 ON (a/0),(a+1) FROM expr_err;
+CREATE STATISTICS expr_err_s3 ON ((a+1)) FROM expr_err;
+ANALYZE expr_err; -- should warn, not fail
+WARNING: skipping statistics object "public.expr_err_s1" for relation "public.expr_err"
+DETAIL: division by zero
+WARNING: skipping statistics object "public.expr_err_s2" for relation "public.expr_err"
+DETAIL: division by zero
+SELECT statistics_name from pg_stats_ext x
+ WHERE tablename = 'expr_err' ORDER BY ROW(x.*);
+ statistics_name
+-----------------
+ expr_err_s3
+(1 row)
+
+DROP TABLE expr_err;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 3cc6012b..f78fd902 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -1608,6 +1608,13 @@ SELECT * FROM check_estimated_rows('SELECT * FROM virtual_gen_stats WHERE w = 0'
DROP TABLE virtual_gen_stats;
+-- extended statistics on virtual generated columns whose expressions can error
+CREATE TABLE virtual_gen_err (a int, b int GENERATED ALWAYS AS (a / 0) VIRTUAL);
+INSERT INTO virtual_gen_err VALUES (1), (2), (3);
+CREATE STATISTICS virtual_gen_err_s ON a, b FROM virtual_gen_err;
+ANALYZE virtual_gen_err; -- should warn, not fail
+DROP TABLE virtual_gen_err;
+
-- Permission tests. Users should not be able to see specific data values in
-- the extended statistics, if they lack permission to see those values in
-- the underlying table.
@@ -1908,3 +1915,16 @@ SELECT range_length_histogram, range_empty_frac, range_bounds_histogram
FROM pg_stats_ext_exprs
WHERE statistics_name = 'stats_ext_range';
DROP TABLE stats_ext_tbl_range;
+
+-- Expression statistics with errors (not involving virtual generated columns).
+-- Errors during expression evaluation should produce a WARNING, not a failure,
+-- and non-erroring statistics objects should still be computed.
+CREATE TABLE expr_err (a int);
+INSERT INTO expr_err VALUES (1), (2), (3);
+CREATE STATISTICS expr_err_s1 ON ((a/0)) FROM expr_err;
+CREATE STATISTICS expr_err_s2 ON (a/0),(a+1) FROM expr_err;
+CREATE STATISTICS expr_err_s3 ON ((a+1)) FROM expr_err;
+ANALYZE expr_err; -- should warn, not fail
+SELECT statistics_name from pg_stats_ext x
+ WHERE tablename = 'expr_err' ORDER BY ROW(x.*);
+DROP TABLE expr_err;
--
2.43.0
^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: Infinite Autovacuum loop caused by failing virtual generated column expression
@ 2026-05-12 10:30 Yugo Nagata <[email protected]>
parent: SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 0 replies; 7+ messages in thread
From: Yugo Nagata @ 2026-05-12 10:30 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Dean Rasheed <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sun, 3 May 2026 11:04:59 -0700
SATYANARAYANA NARLAPURAM <[email protected]> wrote:
Thank you for updating the patch!
There are a few comments on v2 patch.
In compute_expr_stats(),
+ PG_CATCH();
+ {
+ ExecDropSingleTupleTableSlot(slot);
+ FreeExecutorState(estate);
+ PG_RE_THROW();
}
Should we switch the context to expr_context before releasing
slot and estate? Also, should we call MemoryContextDelete(expr_context)
to release palloc'ed memory in the loop?
(In fact, the error would not be cached in compute_expr_stats(), though,
since it would be cached in make_build_data() if any.)
Should we switch to expr_context before releasing slot and estate?
Also, should we call MemoryContextDelete(expr_context) to release
memory allocated by palloc() in the loop?
(In practice, errors would not catched in compute_expr_stats() itself,
since they would already be caught in make_build_data() if any.)
+ /*
+ * Wrap expression evaluation and stats computation in PG_TRY so
+ * that errors from evaluating expressions (e.g. division by zero
+ * in virtual generated columns) don't cause ANALYZE to fail
+ * entirely. Skip the statistics object and issue a WARNING
+ * instead.
How about rewriting the comments to reflect the more general case?
Extended statistics involving virtual generated columns are a somewhat
special case, while errors in expression statistics seem more common
in practice. The same applies to the commit message as well.
Regards,
Yugo Nagata
--
Yugo Nagata <[email protected]>
^ permalink raw reply [nested|flat] 7+ messages in thread
end of thread, other threads:[~2026-05-12 10:30 UTC | newest]
Thread overview: 7+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-04-10 20:18 Infinite Autovacuum loop caused by failing virtual generated column expression SATYANARAYANA NARLAPURAM <[email protected]>
2026-04-11 16:33 ` Dean Rasheed <[email protected]>
2026-04-14 06:24 ` Yugo Nagata <[email protected]>
2026-04-14 07:16 ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-04-28 09:14 ` Yugo Nagata <[email protected]>
2026-05-03 18:04 ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-05-12 10:30 ` Yugo Nagata <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox