public inbox for [email protected]  
help / color / mirror / Atom feed
From: Chao Li <[email protected]>
To: PostgreSQL-development <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Corey Huinker <[email protected]>
Subject: Fix small issues of pg_restore_extended_stats()
Date: Mon, 18 May 2026 10:25:16 +0800
Message-ID: <[email protected]> (raw)

Hi,

While testing pg_restore_extended_stats(), I noticed several small issues.

1. Inconsistent expression key verification behavior

In this example, “b” is an invalid key, so it raises a warning and rejects the input:
```
evantest=# select pg_restore_extended_stats(
    'schemaname', 'repro’,
    'relname', 't’,
    'statistics_schemaname', 'repro’,
    'statistics_name', 's’,
    'inherited', false, 
    'exprs', '[{"b": "1"}]'::jsonb);
WARNING:  could not import element in expression -1: invalid key name
 pg_restore_extended_stats
---------------------------
 f
(1 row)
```

However, when I tried “a”, which is also an invalid key, it is silently ignored:
```
evantest=# select pg_restore_extended_stats(
    'schemaname', 'repro’,
    'relname', 't’,
    'statistics_schemaname', 'repro’,
    'statistics_name', 's’,
    'inherited', false,
    'exprs', '[{"a": "1"}]'::jsonb);
 pg_restore_extended_stats
---------------------------
 t
(1 row)
```

After debugging, I think the problem is here:
```
/*
 * Check if key is found in the list of expression argnames.
 */
static bool
key_in_expr_argnames(JsonbValue *key)
{
	Assert(key->type == jbvString);
	for (int i = 0; i < NUM_ATTRIBUTE_STATS_ELEMS; i++)
	{
		if (strncmp(extexprargname[i], key->val.string.val, key->val.string.len) == 0)
			return true;
	}
	return false;
}
```

This function is actually doing a prefix comparison using the input key length, so as long as an input key matches a prefix of a valid key name, the function returns true.

At runtime, it does not lead to an incorrect value being imported, because the invalid key will still be filtered out later. But one bad scenario I can imagine is that a user is trying to set “correlation”, and accidentally omits the last “n”. In that case, the key is silently discarded and the user is not aware of it, which can lead to a surprising result. For example:
```
evantest=# select pg_restore_extended_stats(
    'schemaname', 'repro’,
    'relname', 't’,
    'statistics_schemaname', 'repro’,
    'statistics_name', 's’,
    'inherited', false,
    'exprs', '[{"n_distinct": "5", "correlatio": "-0.6"}]'::jsonb);
 pg_restore_extended_stats
---------------------------
 t
(1 row)

evantest=# select n_distinct, correlation from pg_stats_ext_exprs where statistics_schemaname = 'repro' and statistics_name = 's';
 n_distinct | correlation
------------+-------------
          5 |
(1 row)
```

Here, the user may think correlation has been set, but it has not, and there is no warning. I think this behavior should be fixed.

The fix is straightforward. If we keep using strncmp() for safety, we should also compare the lengths of both strings.

2. Wrong number in a warning message

I have only one expression in this example:
```
evantest=# select pg_restore_extended_stats(
evantest(#   'schemaname', 'repro',
evantest(#   'relname', 't',
evantest(#   'statistics_schemaname', 'repro',
evantest(#   'statistics_name', 's',
evantest(#   'inherited', false,
evantest(#   'exprs', '[{"n_distinct": "1"}, {"n_distinct": "2"}, {"n_distinct": "3"}]'::jsonb
evantest(# );
WARNING:  could not parse "exprs": incorrect number of elements (3 required)
 pg_restore_extended_stats
---------------------------
 f
(1 row)
```

The warning message says “3 required”, but it should be “1 required”.

The root cause is here:
```
	num_root_elements = JsonContainerSize(root);
	if (numexprs != num_root_elements)
	{
		ereport(WARNING,
				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				errmsg("could not parse \"%s\": incorrect number of elements (%d required)",
					   argname, num_root_elements));
		goto exprs_error;
	}
```

In the ereport, we should use numexprs instead.

3. Inconsistent heap_freetuple()

In pg_clear_extended_stats(), heap_freetuple(tup) is only called before the final return. However, in two earlier paths, the function only emits warning messages and returns. It seems those paths should also call heap_freetuple(tup).

But from my previous experience, I know this kind of issue is usually not considered as serious, so I only point it out here without making a fix for now.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/






Attachments:

  [application/octet-stream] v1-0001-Fix-prefix-matching-of-expression-stats-keys.patch (3.6K, 2-v1-0001-Fix-prefix-matching-of-expression-stats-keys.patch)
  download | inline diff:
From 4573034d05961da16d9f4deb334bea39d620b5e9 Mon Sep 17 00:00:00 2001
From: "Chao Li (Evan)" <[email protected]>
Date: Mon, 18 May 2026 09:17:55 +0800
Subject: [PATCH v1 1/2] Fix prefix matching of expression stats keys

When validating keys in imported expression statistics, the code used
strncmp() with the input key length. This allowed an invalid key that
was only a prefix of a valid key name to be accepted.

Fix this by also checking that the key lengths match before comparing
the key contents. Add a regression test to cover the prefix-key case.

Author: Chao Li <[email protected]>
---
 src/backend/statistics/extended_stats_funcs.c |  3 ++-
 src/test/regress/expected/stats_import.out    | 14 ++++++++++++++
 src/test/regress/sql/stats_import.sql         |  9 +++++++++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/backend/statistics/extended_stats_funcs.c b/src/backend/statistics/extended_stats_funcs.c
index 70393d3a904..15f131724e9 100644
--- a/src/backend/statistics/extended_stats_funcs.c
+++ b/src/backend/statistics/extended_stats_funcs.c
@@ -886,7 +886,8 @@ key_in_expr_argnames(JsonbValue *key)
 	Assert(key->type == jbvString);
 	for (int i = 0; i < NUM_ATTRIBUTE_STATS_ELEMS; i++)
 	{
-		if (strncmp(extexprargname[i], key->val.string.val, key->val.string.len) == 0)
+		if (strlen(extexprargname[i]) == key->val.string.len &&
+			strncmp(extexprargname[i], key->val.string.val, key->val.string.len) == 0)
 			return true;
 	}
 	return false;
diff --git a/src/test/regress/expected/stats_import.out b/src/test/regress/expected/stats_import.out
index f421e83e232..d73bc96039f 100644
--- a/src/test/regress/expected/stats_import.out
+++ b/src/test/regress/expected/stats_import.out
@@ -3256,6 +3256,20 @@ most_common_elems      | {-1,0,1,2,3}
 most_common_elem_freqs | {0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}
 elem_count_histogram   | {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}
 
+-- bad: exprs param which is a prefix of a valid key name
+SELECT pg_catalog.pg_restore_extended_stats(
+  'schemaname', 'stats_import',
+  'relname', 'test',
+  'statistics_schemaname', 'stats_import',
+  'statistics_name', 'test_stat_mcelem',
+  'inherited', false,
+  'exprs', '[{ "n": "-1" }]'::jsonb);
+WARNING:  could not import element in expression -1: invalid key name
+ pg_restore_extended_stats 
+---------------------------
+ f
+(1 row)
+
 -- ok: tsvector exceptions, test just the collation exceptions
 CREATE STATISTICS stats_import.test_stat_tsvec ON (length(name)), (to_tsvector(name)) FROM stats_import.test;
 SELECT pg_catalog.pg_restore_extended_stats(
diff --git a/src/test/regress/sql/stats_import.sql b/src/test/regress/sql/stats_import.sql
index c1bf55690a6..b74f6395a5e 100644
--- a/src/test/regress/sql/stats_import.sql
+++ b/src/test/regress/sql/stats_import.sql
@@ -2244,6 +2244,15 @@ WHERE e.statistics_schemaname = 'stats_import' AND
     e.inherited = false
 \gx
 
+-- bad: exprs param which is a prefix of a valid key name
+SELECT pg_catalog.pg_restore_extended_stats(
+  'schemaname', 'stats_import',
+  'relname', 'test',
+  'statistics_schemaname', 'stats_import',
+  'statistics_name', 'test_stat_mcelem',
+  'inherited', false,
+  'exprs', '[{ "n": "-1" }]'::jsonb);
+
 -- ok: tsvector exceptions, test just the collation exceptions
 CREATE STATISTICS stats_import.test_stat_tsvec ON (length(name)), (to_tsvector(name)) FROM stats_import.test;
 SELECT pg_catalog.pg_restore_extended_stats(
-- 
2.50.1 (Apple Git-155)



  [application/octet-stream] v1-0002-Fix-required-expression-count-in-stats-import-war.patch (4.0K, 3-v1-0002-Fix-required-expression-count-in-stats-import-war.patch)
  download | inline diff:
From 961bb9aa5139b303a88b195a3994e0cf7b7d5e63 Mon Sep 17 00:00:00 2001
From: "Chao Li (Evan)" <[email protected]>
Date: Mon, 18 May 2026 10:03:24 +0800
Subject: [PATCH v1 2/2] Fix required expression count in stats import warning

When importing expression statistics, the warning for an incorrect
number of expression elements reported the number of elements found in
the input as the required count. This made the message misleading,
especially when the input contained too few elements.

Report the expected number of expressions instead. Also extend the
regression test to cover both too-few and too-many expression elements.

Author: Chao Li <[email protected]>
---
 src/backend/statistics/extended_stats_funcs.c |  2 +-
 src/test/regress/expected/stats_import.out    | 18 ++++++++++++++++--
 src/test/regress/sql/stats_import.sql         | 10 +++++++++-
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/backend/statistics/extended_stats_funcs.c b/src/backend/statistics/extended_stats_funcs.c
index 15f131724e9..8fc9c4c6f45 100644
--- a/src/backend/statistics/extended_stats_funcs.c
+++ b/src/backend/statistics/extended_stats_funcs.c
@@ -1593,7 +1593,7 @@ import_expressions(Relation pgsd, int numexprs,
 		ereport(WARNING,
 				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				errmsg("could not parse \"%s\": incorrect number of elements (%d required)",
-					   argname, num_root_elements));
+					   argname, numexprs));
 		goto exprs_error;
 	}
 
diff --git a/src/test/regress/expected/stats_import.out b/src/test/regress/expected/stats_import.out
index d73bc96039f..183e68d65c2 100644
--- a/src/test/regress/expected/stats_import.out
+++ b/src/test/regress/expected/stats_import.out
@@ -2455,7 +2455,7 @@ WARNING:  could not parse "exprs": root-level array required
  f
 (1 row)
 
--- wrong number of exprs
+-- wrong number of exprs, too few
 SELECT pg_catalog.pg_restore_extended_stats(
   'schemaname', 'stats_import',
   'relname', 'test_clone',
@@ -2463,7 +2463,21 @@ SELECT pg_catalog.pg_restore_extended_stats(
   'statistics_name', 'test_stat_clone',
   'inherited', false,
   'exprs', '[ { "avg_width": "4" } ]'::jsonb);
-WARNING:  could not parse "exprs": incorrect number of elements (1 required)
+WARNING:  could not parse "exprs": incorrect number of elements (2 required)
+ pg_restore_extended_stats 
+---------------------------
+ f
+(1 row)
+
+-- wrong number of exprs, too many
+SELECT pg_catalog.pg_restore_extended_stats(
+  'schemaname', 'stats_import',
+  'relname', 'test_clone',
+  'statistics_schemaname', 'stats_import',
+  'statistics_name', 'test_stat_clone',
+  'inherited', false,
+  'exprs', '[ { "avg_width": "4" }, { "avg_width": "4" }, { "avg_width": "4" } ]'::jsonb);
+WARNING:  could not parse "exprs": incorrect number of elements (2 required)
  pg_restore_extended_stats 
 ---------------------------
  f
diff --git a/src/test/regress/sql/stats_import.sql b/src/test/regress/sql/stats_import.sql
index b74f6395a5e..6064b7722da 100644
--- a/src/test/regress/sql/stats_import.sql
+++ b/src/test/regress/sql/stats_import.sql
@@ -1750,7 +1750,7 @@ SELECT pg_catalog.pg_restore_extended_stats(
   'statistics_name', 'test_stat_clone',
   'inherited', false,
   'exprs', '{ "avg_width": "4", "null_frac": "0" }'::jsonb);
--- wrong number of exprs
+-- wrong number of exprs, too few
 SELECT pg_catalog.pg_restore_extended_stats(
   'schemaname', 'stats_import',
   'relname', 'test_clone',
@@ -1758,6 +1758,14 @@ SELECT pg_catalog.pg_restore_extended_stats(
   'statistics_name', 'test_stat_clone',
   'inherited', false,
   'exprs', '[ { "avg_width": "4" } ]'::jsonb);
+-- wrong number of exprs, too many
+SELECT pg_catalog.pg_restore_extended_stats(
+  'schemaname', 'stats_import',
+  'relname', 'test_clone',
+  'statistics_schemaname', 'stats_import',
+  'statistics_name', 'test_stat_clone',
+  'inherited', false,
+  'exprs', '[ { "avg_width": "4" }, { "avg_width": "4" }, { "avg_width": "4" } ]'::jsonb);
 -- incorrect type of value: should be a string or a NULL.
 SELECT pg_catalog.pg_restore_extended_stats(
   'schemaname', 'stats_import',
-- 
2.50.1 (Apple Git-155)



reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Fix small issues of pg_restore_extended_stats()
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox