public inbox for [email protected]
help / color / mirror / Atom feedRe: JumbleQuery ma treat different GROUP BY expr as the same
5+ messages / 3 participants
[nested] [flat]
* Re: JumbleQuery ma treat different GROUP BY expr as the same
@ 2026-01-12 08:20 jian he <[email protected]>
0 siblings, 1 reply; 5+ messages in thread
From: jian he @ 2026-01-12 08:20 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Michael Paquier <[email protected]>; pgsql-hackers; Richard Guo <[email protected]>
hi.
while working on it, I guess I found another bug, below JumbleQuery will return
the same result:
SELECT FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
SELECT a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
SELECT a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
so I think TargetEntry.resjunk should not be marked as query_jumble_ignore.
addRangeTableEntryForGroup
will make RangeTblEntry(RTE_GROUP) have a newly copied original groupexprs.
Query->targetList also has the original groupexprs.
but after parseCheckAggregates->substitute_grouped_columns.
Query->targetList Var node will point to the offset of the RTE_GROUP,
not the RTE_RELATION.
see src/backend/parser/parse_agg.c line 1543.
After parseCheckAggregates, JumbleQuery(Query->targetList) will produce the same
result as long as the grouping columns have the same list of data types.
JumbleQuery(Query->groupClause) will also produce the same result
as long as the grouping columns have the same list of data types.
Since only the RangeTblEntry(RTE_GROUP) have the original grouping expressions,
we can not mark the RangeTblEntry->groupexprs as query_jumble_ignore.
looking at, transformUpdateTargetList->transformTargetList
so i think it's OK to remove query_jumble_ignore from TargetEntry.resjunk for
INSERT/UPDATE/DELETE.
--
jian
https://www.enterprisedb.com/
Attachments:
[text/x-patch] v1-0001-Remove-query_jumble_ignore-from-the-fields-in-RangeTblEntry-and-T.patch (8.6K, 2-v1-0001-Remove-query_jumble_ignore-from-the-fields-in-RangeTblEntry-and-T.patch)
download | inline diff:
From 15623f1bea4b7ed6bf778113ad627765ef2f86a9 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Mon, 12 Jan 2026 16:17:37 +0800
Subject: [PATCH v1 1/1] Remove query_jumble_ignore from the fields in
RangeTblEntry and TargetEntry
discussion: https://postgr.es/m/CACJufxEy2W+tCqC7XuJ94r3ivWsM=onKJp94kRFx3hoARjBeFQ@mail.gmail.com
---
.../pg_stat_statements/expected/select.out | 112 +++++++++++++++++-
contrib/pg_stat_statements/sql/select.sql | 18 +++
src/include/nodes/parsenodes.h | 2 +-
src/include/nodes/primnodes.h | 2 +-
4 files changed, 130 insertions(+), 4 deletions(-)
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 75c896f3885..eafc54964dc 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -99,6 +99,22 @@ SELECT 2 AS "int" ORDER BY 1;
2
(1 row)
+SELECT FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+--
+(1 row)
+
+SELECT a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+ a
+---
+ 1
+(1 row)
+
+SELECT a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+ a | b
+---+---
+ 1 | 2
+(1 row)
+
/* this comment should not appear in the output */
SELECT 'hello'
-- but this one will appear
@@ -223,6 +239,9 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
1 | 2 | SELECT $1 AS i UNION SELECT $2 ORDER BY i
1 | 1 | SELECT $1 || $2
2 | 2 | SELECT DISTINCT $1 AS "int"
+ 1 | 1 | SELECT FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) ORDER BY a, b
+ 1 | 1 | SELECT a FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) ORDER BY a, b
+ 1 | 1 | SELECT a, b FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) ORDER BY a, b
0 | 0 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"
1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
1 | 2 | WITH t(f) AS ( +
@@ -230,7 +249,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
| | ) +
| | SELECT f FROM t ORDER BY f
1 | 1 | select $1::jsonb ? $2
-(17 rows)
+(20 rows)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
@@ -459,6 +478,95 @@ SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%FETCH FIRST%';
2
(1 row)
+-- GROUP BY, HAVING, GROUPING
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(a, ());
+ count
+-------
+ 1
+ 1
+(2 rows)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(b, ());
+ count
+-------
+ 1
+ 1
+(2 rows)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1;
+ count
+-------
+(0 rows)
+
+SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1;
+ count | a
+-------+---
+ 1 | 1
+(1 row)
+
+SELECT COUNT(*), b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1;
+ count | b
+-------+---
+(0 rows)
+
+SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1 ORDER BY a;
+ count | a
+-------+---
+ 1 | 1
+(1 row)
+
+SELECT COUNT(*), b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1 ORDER BY b;
+ count | b
+-------+---
+(0 rows)
+
+SELECT GROUPING(a) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+ grouping
+----------
+ 0
+(1 row)
+
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+ grouping
+----------
+ 0
+(1 row)
+
+SELECT calls, query FROM pg_stat_statements WHERE query LIKE '%GROUP BY%' ORDER BY query COLLATE "C";
+ calls | query
+-------+----------------------------------------------------------------------------------------------------
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY GROUPING SETS(a, ())
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY GROUPING SETS(b, ())
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a HAVING a = $3
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b HAVING b = $3
+ 1 | SELECT COUNT(*), a FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a HAVING a = $3
+ 1 | SELECT COUNT(*), a FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a HAVING a = $3 ORDER BY a
+ 1 | SELECT COUNT(*), b FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b HAVING b = $3
+ 1 | SELECT COUNT(*), b FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b HAVING b = $3 ORDER BY b
+ 1 | SELECT GROUPING(a) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a
+ 1 | SELECT GROUPING(b) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b
+(12 rows)
+
-- GROUP BY [DISTINCT]
SELECT a, b, c
FROM (VALUES (1, 2, 3), (4, NULL, 6), (7, 8, 9)) AS t (a, b, c)
@@ -548,7 +656,7 @@ SELECT (
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%SELECT GROUPING%';
count
-------
- 2
+ 4
(1 row)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index 11662cde08c..33e0e63cd1e 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -31,6 +31,9 @@ SELECT DISTINCT 1 AS "int";
SELECT DISTINCT 2 AS "int";
SELECT 1 AS "int" ORDER BY 1;
SELECT 2 AS "int" ORDER BY 1;
+SELECT FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+SELECT a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+SELECT a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
/* this comment should not appear in the output */
SELECT 'hello'
@@ -158,6 +161,21 @@ FETCH FIRST 2 ROW ONLY;
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%FETCH FIRST%';
+-- GROUP BY, HAVING, GROUPING
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(a, ());
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(b, ());
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1;
+SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1;
+SELECT COUNT(*), b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1;
+SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1 ORDER BY a;
+SELECT COUNT(*), b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1 ORDER BY b;
+SELECT GROUPING(a) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+SELECT calls, query FROM pg_stat_statements WHERE query LIKE '%GROUP BY%' ORDER BY query COLLATE "C";
+
-- GROUP BY [DISTINCT]
SELECT a, b, c
FROM (VALUES (1, 2, 3), (4, NULL, 6), (7, 8, 9)) AS t (a, b, c)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aac4bfc70d9..646d6ced763 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1294,7 +1294,7 @@ typedef struct RangeTblEntry
* Fields valid for a GROUP RTE (else NIL):
*/
/* list of grouping expressions */
- List *groupexprs pg_node_attr(query_jumble_ignore);
+ List *groupexprs;
/*
* Fields valid in all RTEs:
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 5211cadc258..174ebe92939 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -2248,7 +2248,7 @@ typedef struct TargetEntry
/* column's number in source table */
AttrNumber resorigcol pg_node_attr(query_jumble_ignore);
/* set to true to eliminate the attribute from final target list */
- bool resjunk pg_node_attr(query_jumble_ignore);
+ bool resjunk;
} TargetEntry;
--
2.34.1
^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: JumbleQuery ma treat different GROUP BY expr as the same
@ 2026-01-13 07:09 Michael Paquier <[email protected]>
parent: jian he <[email protected]>
0 siblings, 2 replies; 5+ messages in thread
From: Michael Paquier @ 2026-01-13 07:09 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; Richard Guo <[email protected]>
On Mon, Jan 12, 2026 at 04:20:44PM +0800, jian he wrote:
> While working on it, I guess I found another bug, below JumbleQuery will return
> the same result:
>
> SELECT FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
> SELECT a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
> SELECT a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
>
> so I think TargetEntry.resjunk should not be marked as query_jumble_ignore.
Not sure how to feel about this one, as a primary node. 3db72ebcbe20
has put a query_jumble_ignore to TargetEntry.resjunk which was simply
a consistent move with the pre-v15 branches because these columns have
always been ignored. I have never heard complaints about that in the
field with PGSS, TBH. The original choice comes from this thread,
back in 2012 when this was still integrated into PGSS:
https://www.postgresql.org/message-id/CAEYLb_WGeFCT7MfJ8FXf-CR6BSE6Lbn%2BO1VX3%2BOGrc4Bscn4%3DA%40ma...
Anyway, let's not mix apples and oranges for now. The GROUP BY issue
is a bug worth fixing on its own. What you are pointing out with
resjunk is the original behavior we have been relying on. If we
finish by changing it, this should not and cannot be backpatched.
I have expanded a bit the tests, with a couple of extra patterns,
giving the attached. The behavior is the same as the pre-v17
branches.
--
Michael
Attachments:
[text/x-diff] v2-0001-Fix-query-jumbling-computations-with-GROUP-BY-cla.patch (6.1K, 2-v2-0001-Fix-query-jumbling-computations-with-GROUP-BY-cla.patch)
download | inline diff:
From 0da22db67538f3ead39c4a4fcbe2afc3a513454f Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Mon, 12 Jan 2026 16:17:37 +0800
Subject: [PATCH v2] Fix query jumbling computations with GROUP BY clauses
Discussion: https://postgr.es/m/CACJufxEy2W+tCqC7XuJ94r3ivWsM=onKJp94kRFx3hoARjBeFQ@mail.gmail.com
---
src/include/nodes/parsenodes.h | 2 +-
.../pg_stat_statements/expected/select.out | 98 ++++++++++++++++++-
contrib/pg_stat_statements/sql/select.sql | 16 +++
3 files changed, 114 insertions(+), 2 deletions(-)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aac4bfc70d99..646d6ced763c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1294,7 +1294,7 @@ typedef struct RangeTblEntry
* Fields valid for a GROUP RTE (else NIL):
*/
/* list of grouping expressions */
- List *groupexprs pg_node_attr(query_jumble_ignore);
+ List *groupexprs;
/*
* Fields valid in all RTEs:
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 75c896f38851..a069119c7900 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -459,6 +459,102 @@ SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%FETCH FIRST%';
2
(1 row)
+-- GROUP BY, HAVING, GROUPING
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b, a;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(a, ());
+ count
+-------
+ 1
+ 1
+(2 rows)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(b, ());
+ count
+-------
+ 1
+ 1
+(2 rows)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1;
+ count
+-------
+ 1
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 2;
+ count
+-------
+(0 rows)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1;
+ count
+-------
+(0 rows)
+
+SELECT GROUPING(a) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+ grouping
+----------
+ 0
+(1 row)
+
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+ grouping
+----------
+ 0
+(1 row)
+
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+ grouping
+----------
+ 0
+(1 row)
+
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b, a;
+ grouping
+----------
+ 0
+(1 row)
+
+SELECT calls, query FROM pg_stat_statements WHERE query LIKE '%GROUP BY%' ORDER BY query COLLATE "C";
+ calls | query
+-------+-------------------------------------------------------------------------------------------
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY GROUPING SETS(a, ())
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY GROUPING SETS(b, ())
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a
+ 2 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a HAVING a = $3
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a, b
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b HAVING b = $3
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b, a
+ 1 | SELECT GROUPING(a) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a
+ 1 | SELECT GROUPING(b) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a, b
+ 1 | SELECT GROUPING(b) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b
+ 1 | SELECT GROUPING(b) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b, a
+(12 rows)
+
-- GROUP BY [DISTINCT]
SELECT a, b, c
FROM (VALUES (1, 2, 3), (4, NULL, 6), (7, 8, 9)) AS t (a, b, c)
@@ -548,7 +644,7 @@ SELECT (
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%SELECT GROUPING%';
count
-------
- 2
+ 6
(1 row)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index 11662cde08c9..a10d618c034e 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -158,6 +158,22 @@ FETCH FIRST 2 ROW ONLY;
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%FETCH FIRST%';
+-- GROUP BY, HAVING, GROUPING
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b, a;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(a, ());
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(b, ());
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 1;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a HAVING a = 2;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b HAVING b = 1;
+SELECT GROUPING(a) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+SELECT GROUPING(b) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b, a;
+SELECT calls, query FROM pg_stat_statements WHERE query LIKE '%GROUP BY%' ORDER BY query COLLATE "C";
+
-- GROUP BY [DISTINCT]
SELECT a, b, c
FROM (VALUES (1, 2, 3), (4, NULL, 6), (7, 8, 9)) AS t (a, b, c)
--
2.51.0
[application/pgp-signature] signature.asc (833B, 3-signature.asc)
download
^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: JumbleQuery ma treat different GROUP BY expr as the same
@ 2026-01-13 23:46 Michael Paquier <[email protected]>
parent: Michael Paquier <[email protected]>
1 sibling, 0 replies; 5+ messages in thread
From: Michael Paquier @ 2026-01-13 23:46 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; Richard Guo <[email protected]>
On Tue, Jan 13, 2026 at 04:09:46PM +0900, Michael Paquier wrote:
> I have expanded a bit the tests, with a couple of extra patterns,
> giving the attached. The behavior is the same as the pre-v17
> branches.
Done this part for the GROUP BY regression now, as of e217dc7484e5,
down to v18.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
download
^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: JumbleQuery ma treat different GROUP BY expr as the same
@ 2026-01-14 06:15 jian he <[email protected]>
parent: Michael Paquier <[email protected]>
1 sibling, 1 reply; 5+ messages in thread
From: jian he @ 2026-01-14 06:15 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; Richard Guo <[email protected]>
On Tue, Jan 13, 2026 at 3:10 PM Michael Paquier <[email protected]> wrote:
>
> On Mon, Jan 12, 2026 at 04:20:44PM +0800, jian he wrote:
> > While working on it, I guess I found another bug, below JumbleQuery will return
> > the same result:
> >
> > SELECT FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
> > SELECT a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
> > SELECT a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
> >
> > so I think TargetEntry.resjunk should not be marked as query_jumble_ignore.
>
> Not sure how to feel about this one, as a primary node. 3db72ebcbe20
> has put a query_jumble_ignore to TargetEntry.resjunk which was simply
> a consistent move with the pre-v15 branches because these columns have
> always been ignored. I have never heard complaints about that in the
> field with PGSS, TBH. The original choice comes from this thread,
> back in 2012 when this was still integrated into PGSS:
> https://www.postgresql.org/message-id/CAEYLb_WGeFCT7MfJ8FXf-CR6BSE6Lbn%2BO1VX3%2BOGrc4Bscn4%3DA%40ma...
>
if not remove the query_jumble_ignore from TargetEntry.resjunk
the below query would have the same QueryID.
SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
SELECT COUNT(*), a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
It affects queries that include an ORDER BY or GROUP BY clause, so a patch is
attached.
--
jian
https://www.enterprisedb.com/
Attachments:
[text/x-patch] v3-0001-Fix-query-jumbling-for-TargetEntry.resjunk.patch (6.8K, 2-v3-0001-Fix-query-jumbling-for-TargetEntry.resjunk.patch)
download | inline diff:
From 7da8cf6a17a72522bf7120a2db2f6d8ceb14f837 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Wed, 14 Jan 2026 13:30:39 +0800
Subject: [PATCH v3 1/1] Fix query jumbling for TargetEntry.resjunk
TargetEntry.resjunk was marked with the node attribute query_jumble_ignore,
causing such target list entry to be ignored during the query jumbling.
For example, these two queries could be grouped together within the same query ID:
SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
SELECT COUNT(*), a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
However, as such queries have different semantics, they should be split across
multiple entries.
We do need to mark the following two queries as identical.
SELECT COUNT(*), a as b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
Therefore TargetEntry.resname marked as query_jumble_ignore is correct.
Discussion: https://postgr.es/m/CACJufxEy2W+tCqC7XuJ94r3ivWsM=onKJp94kRFx3hoARjBeFQ@mail.gmail.com
---
.../pg_stat_statements/expected/select.out | 44 ++++++++++++++++++-
contrib/pg_stat_statements/sql/select.sql | 6 +++
src/include/nodes/primnodes.h | 2 +-
3 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index a069119c790..6c391d43f1c 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -99,6 +99,22 @@ SELECT 2 AS "int" ORDER BY 1;
2
(1 row)
+SELECT FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+--
+(1 row)
+
+SELECT a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+ a
+---
+ 1
+(1 row)
+
+SELECT a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+ a | b
+---+---
+ 1 | 2
+(1 row)
+
/* this comment should not appear in the output */
SELECT 'hello'
-- but this one will appear
@@ -223,6 +239,9 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
1 | 2 | SELECT $1 AS i UNION SELECT $2 ORDER BY i
1 | 1 | SELECT $1 || $2
2 | 2 | SELECT DISTINCT $1 AS "int"
+ 1 | 1 | SELECT FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) ORDER BY a, b
+ 1 | 1 | SELECT a FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) ORDER BY a, b
+ 1 | 1 | SELECT a, b FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) ORDER BY a, b
0 | 0 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"
1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
1 | 2 | WITH t(f) AS ( +
@@ -230,7 +249,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
| | ) +
| | SELECT f FROM t ORDER BY f
1 | 1 | select $1::jsonb ? $2
-(17 rows)
+(20 rows)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
@@ -478,6 +497,24 @@ SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
1
(1 row)
+SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+ count | a
+-------+---
+ 1 | 1
+(1 row)
+
+SELECT COUNT(*), a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+ count | a | b
+-------+---+---
+ 1 | 1 | 2
+(1 row)
+
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b ORDER BY a, b;
+ count
+-------
+ 1
+(1 row)
+
SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b, a;
count
-------
@@ -546,14 +583,17 @@ SELECT calls, query FROM pg_stat_statements WHERE query LIKE '%GROUP BY%' ORDER
1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a
2 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a HAVING a = $3
1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a, b
+ 1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a, b ORDER BY a, b
1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b
1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b HAVING b = $3
1 | SELECT COUNT(*) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b, a
+ 1 | SELECT COUNT(*), a FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a, b
+ 1 | SELECT COUNT(*), a, b FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a, b
1 | SELECT GROUPING(a) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a
1 | SELECT GROUPING(b) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY a, b
1 | SELECT GROUPING(b) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b
1 | SELECT GROUPING(b) FROM (VALUES ($1::INT, $2::INT)) AS t(a, b) GROUP BY b, a
-(12 rows)
+(15 rows)
-- GROUP BY [DISTINCT]
SELECT a, b, c
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index a10d618c034..6c826e61870 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -31,6 +31,9 @@ SELECT DISTINCT 1 AS "int";
SELECT DISTINCT 2 AS "int";
SELECT 1 AS "int" ORDER BY 1;
SELECT 2 AS "int" ORDER BY 1;
+SELECT FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+SELECT a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
+SELECT a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) ORDER BY a, b;
/* this comment should not appear in the output */
SELECT 'hello'
@@ -162,6 +165,9 @@ SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%FETCH FIRST%';
SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a;
SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b;
SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+SELECT COUNT(*), a FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+SELECT COUNT(*), a, b FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b;
+SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY a, b ORDER BY a, b;
SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY b, a;
SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(a, ());
SELECT COUNT(*) FROM (VALUES (1::INT, 2::INT)) AS t(a, b) GROUP BY GROUPING SETS(b, ());
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 5211cadc258..174ebe92939 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -2248,7 +2248,7 @@ typedef struct TargetEntry
/* column's number in source table */
AttrNumber resorigcol pg_node_attr(query_jumble_ignore);
/* set to true to eliminate the attribute from final target list */
- bool resjunk pg_node_attr(query_jumble_ignore);
+ bool resjunk;
} TargetEntry;
--
2.34.1
^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: JumbleQuery ma treat different GROUP BY expr as the same
@ 2026-03-12 07:47 Keisuke Kuroda <[email protected]>
parent: jian he <[email protected]>
0 siblings, 0 replies; 5+ messages in thread
From: Keisuke Kuroda @ 2026-03-12 07:47 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: Michael Paquier <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers; Richard Guo <[email protected]>
Hi hackers,
In previous versions, changing the column order in the SELECT clause
also resulted in a different query_id.The test results are in the
attached queryidtest.txt file.
If TargetEntry.resjunk is not marked as query_jumble_ignore, it feels
a bit off that the query_id differs even when the execution plan is
identical, including the output order.
However, it is also true that queries with different column
specifications in the SQL SELECT clause are collected as identical
SQL. To distinguish and collect information on these, modifications
like this patch are necessary.
--
Keisuke Kuroda
NTT DOCOMO SOLUTIONS, Inc.
## PostgreSQL 18.3
CREATE TABLE t(c1 int,c2 int);
postgres=# EXPLAIN(VERBOSE) SELECT c1, c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: -4675081474881745094
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2, c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -3269814016130981549
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: -4675081474881745094
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -3269814016130981549
(6 rows)
## PostreSQL 19dev(d841ca2d149666b8) with Patch
CREATE TABLE t(c1 int,c2 int);
postgres=# EXPLAIN(VERBOSE) SELECT c1, c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: 3650220897747797500
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2, c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -3038222286574995023
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: -8371868852328991336
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -8689818593596346321
(6 rows)
Attachments:
[text/plain] queryidtest.txt (2.8K, 2-queryidtest.txt)
download | inline:
## PostgreSQL 18.3
CREATE TABLE t(c1 int,c2 int);
postgres=# EXPLAIN(VERBOSE) SELECT c1, c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: -4675081474881745094
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2, c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -3269814016130981549
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: -4675081474881745094
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -3269814016130981549
(6 rows)
## PostreSQL 19dev(d841ca2d149666b8) with Patch
CREATE TABLE t(c1 int,c2 int);
postgres=# EXPLAIN(VERBOSE) SELECT c1, c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: 3650220897747797500
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2, c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -3038222286574995023
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c1 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c1, c2
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c1, c2
Query Identifier: -8371868852328991336
(6 rows)
postgres=# EXPLAIN(VERBOSE) SELECT c2 FROM t ORDER BY c1,c2;
QUERY PLAN
------------------------------------------------------------------
Sort (cost=158.51..164.16 rows=2260 width=8)
Output: c2, c1
Sort Key: t.c1, t.c2
-> Seq Scan on public.t (cost=0.00..32.60 rows=2260 width=8)
Output: c2, c1
Query Identifier: -8689818593596346321
(6 rows)
^ permalink raw reply [nested|flat] 5+ messages in thread
end of thread, other threads:[~2026-03-12 07:47 UTC | newest]
Thread overview: 5+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-01-12 08:20 Re: JumbleQuery ma treat different GROUP BY expr as the same jian he <[email protected]>
2026-01-13 07:09 ` Michael Paquier <[email protected]>
2026-01-13 23:46 ` Michael Paquier <[email protected]>
2026-01-14 06:15 ` jian he <[email protected]>
2026-03-12 07:47 ` Keisuke Kuroda <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox