public inbox for [email protected]  
help / color / mirror / Atom feed
pgsql: Introduce squashing of constant lists in query jumbling
10+ messages / 6 participants
[nested] [flat]

* pgsql: Introduce squashing of constant lists in query jumbling
@ 2025-03-18 17:58 Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Álvaro Herrera @ 2025-03-18 17:58 UTC (permalink / raw)
  To: [email protected]

Introduce squashing of constant lists in query jumbling

pg_stat_statements produces multiple entries for queries like
    SELECT something FROM table WHERE col IN (1, 2, 3, ...)

depending on the number of parameters, because every element of
ArrayExpr is individually jumbled.  Most of the time that's undesirable,
especially if the list becomes too large.

Fix this by introducing a new GUC query_id_squash_values which modifies
the node jumbling code to only consider the first and last element of a
list of constants, rather than each list element individually.  This
affects both the query_id generated by query jumbling, as well as
pg_stat_statements query normalization so that it suppresses printing of
the individual elements of such a list.

The default value is off, meaning the previous behavior is maintained.

Author: Dmitry Dolgov <[email protected]>
Reviewed-by: Sergey Dudoladov (mysterious, off-list)
Reviewed-by: David Geier <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Álvaro Herrera <[email protected]>
Reviewed-by: Sami Imseih <[email protected]>
Reviewed-by: Sutou Kouhei <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Reviewed-by: Marcos Pegoraro <[email protected]>
Reviewed-by: Julien Rouhaud <[email protected]>
Reviewed-by: Zhihong Yu <[email protected]>
Tested-by: Yasuo Honda <[email protected]>
Tested-by: Sergei Kornilov <[email protected]>
Tested-by: Maciek Sakrejda <[email protected]>
Tested-by: Chengxi Sun <[email protected]>
Tested-by: Jakub Wartak <[email protected]>
Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/62d712ecfd940f60e68bde5b6972b6859937c412

Modified Files
--------------
contrib/pg_stat_statements/Makefile               |   2 +-
contrib/pg_stat_statements/expected/squashing.out | 464 ++++++++++++++++++++++
contrib/pg_stat_statements/meson.build            |   1 +
contrib/pg_stat_statements/pg_stat_statements.c   |  76 +++-
contrib/pg_stat_statements/sql/squashing.sql      | 180 +++++++++
doc/src/sgml/config.sgml                          |  30 ++
doc/src/sgml/pgstatstatements.sgml                |  24 +-
src/backend/nodes/gen_node_support.pl             |  19 +-
src/backend/nodes/queryjumblefuncs.c              | 146 ++++++-
src/backend/postmaster/launch_backend.c           |   3 +
src/backend/utils/misc/guc_tables.c               |  10 +
src/backend/utils/misc/postgresql.conf.sample     |   1 +
src/include/nodes/nodes.h                         |   2 +
src/include/nodes/primnodes.h                     |   2 +-
src/include/nodes/queryjumble.h                   |   7 +
15 files changed, 945 insertions(+), 22 deletions(-)



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
@ 2025-03-25 16:28 ` Christoph Berg <[email protected]>
  2025-03-25 16:53   ` Re: Squash constant lists in query jumbling by default Tom Lane <[email protected]>
  2025-03-25 21:24   ` Re: Squash constant lists in query jumbling by default Laurenz Albe <[email protected]>
  0 siblings, 2 replies; 10+ messages in thread

From: Christoph Berg @ 2025-03-25 16:28 UTC (permalink / raw)
  To: [email protected]; +Cc: Álvaro Herrera <[email protected]>; Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Tom Lane <[email protected]>; Julien Rouhaud <[email protected]>

Re: Álvaro Herrera
> Introduce squashing of constant lists in query jumbling
> 
> pg_stat_statements produces multiple entries for queries like
>     SELECT something FROM table WHERE col IN (1, 2, 3, ...)
> 
> depending on the number of parameters, because every element of
> ArrayExpr is individually jumbled.  Most of the time that's undesirable,
> especially if the list becomes too large.
> 
> Fix this by introducing a new GUC query_id_squash_values which modifies
> the node jumbling code to only consider the first and last element of a
> list of constants, rather than each list element individually.  This
> affects both the query_id generated by query jumbling, as well as
> pg_stat_statements query normalization so that it suppresses printing of
> the individual elements of such a list.
> 
> The default value is off, meaning the previous behavior is maintained.

The "jumble names of temp tables" thread was briefly touching this [1],
I'm starting a new thread since the others are already very long.

[1] https://www.postgresql.org/message-id/flat/CAA5RZ0uNofEXfEfNw3uRN3D3oXkFPQ_s%2BhuLLHMKR_%2BMCk8RPQ%4...

Two points were made:

1) this should be on by default
2) there should be no GUC for it.

For 1), Sami said "Why would anyone not want to squash the IN list?"
to which I can only agree. Michael agreed as well, that's already +3.

For 2), Tom said that configurability is 1) often much less useful
than originally planned, and 2) tools have to cope with both settings
anyway, making implementing them harder. Plus, switching at run-time
makes the result even less predictable.

So, I would propose that we drop the GUC and make it the default.
Opinions?

Christoph





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
@ 2025-03-25 16:53   ` Tom Lane <[email protected]>
  2025-03-25 17:27     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>
  1 sibling, 1 reply; 10+ messages in thread

From: Tom Lane @ 2025-03-25 16:53 UTC (permalink / raw)
  To: Christoph Berg <[email protected]>; +Cc: [email protected]; Álvaro Herrera <[email protected]>; Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Julien Rouhaud <[email protected]>

Christoph Berg <[email protected]> writes:
> For 2), Tom said that configurability is 1) often much less useful
> than originally planned, and 2) tools have to cope with both settings
> anyway, making implementing them harder. Plus, switching at run-time
> makes the result even less predictable.

To clarify that last bit: if some clients run with the GUC on and some
with it off, you have a mess.  Even statements that are completely
identical will have different query IDs under the two settings.

If this GUC sticks around, it should be at least PGC_SUSET (on
the analogy of compute_query_id) to make it harder to break
pg_stat_statements that way.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
  2025-03-25 16:53   ` Re: Squash constant lists in query jumbling by default Tom Lane <[email protected]>
@ 2025-03-25 17:27     ` Álvaro Herrera <[email protected]>
  2025-03-25 17:55       ` Re: Squash constant lists in query jumbling by default Sami Imseih <[email protected]>
  2025-03-25 18:40       ` Re: Squash constant lists in query jumbling by default Dmitry Dolgov <[email protected]>
  0 siblings, 2 replies; 10+ messages in thread

From: Álvaro Herrera @ 2025-03-25 17:27 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Christoph Berg <[email protected]>; [email protected]; Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Julien Rouhaud <[email protected]>; Dmitry Dolgov <[email protected]>

On 2025-Mar-25, Tom Lane wrote:

> Christoph Berg <[email protected]> writes:
> > For 2), Tom said that configurability is 1) often much less useful
> > than originally planned, and 2) tools have to cope with both settings
> > anyway, making implementing them harder. Plus, switching at run-time
> > makes the result even less predictable.
> 
> To clarify that last bit: if some clients run with the GUC on and some
> with it off, you have a mess.  Even statements that are completely
> identical will have different query IDs under the two settings.

True.

> If this GUC sticks around, it should be at least PGC_SUSET (on
> the analogy of compute_query_id) to make it harder to break
> pg_stat_statements that way.

I have no problem making it superuser-only, and I can see making "on" be
the default.  I am not opposed to removing it completely either, if we
really think that the current behavior is no longer useful for anybody.

Earlier in the discussion, other possible values for the option were
suggested, such as a way to distinguish arrays that had "lots" (say,
hundreds or more) of entries from arrays that were "small".  That could
be selected by the user (or site admin) using this GUC, though there was
no agreement on exactly what that would be.  During the FOSDEM 2024
development meeting there was a general dislike of this idea, which
AFAIR was mostly predicated on the displayed query no longer being valid
SQL.  But now that we've chosen a format that uses SQL comments, this is
no longer a problem, so I think we haven't closed that door yet.  But we
may still find out that no user cares about this.

Dmitry?

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Los dioses no protegen a los insensatos.  Éstos reciben protección de
otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
  2025-03-25 16:53   ` Re: Squash constant lists in query jumbling by default Tom Lane <[email protected]>
  2025-03-25 17:27     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>
@ 2025-03-25 17:55       ` Sami Imseih <[email protected]>
  1 sibling, 0 replies; 10+ messages in thread

From: Sami Imseih @ 2025-03-25 17:55 UTC (permalink / raw)
  To: Álvaro Herrera <[email protected]>; +Cc: Tom Lane <[email protected]>; Christoph Berg <[email protected]>; [email protected]; Michael Paquier <[email protected]>; Julien Rouhaud <[email protected]>; Dmitry Dolgov <[email protected]>

> > If this GUC sticks around, it should be at least PGC_SUSET (on
> > the analogy of compute_query_id) to make it harder to break
> > pg_stat_statements that way.
>
> I have no problem making it superuser-only, and I can see making "on" be
> the default.  I am not opposed to removing it completely either, if we
> really think that the current behavior is no longer useful for anybody.

I am in favor of complete removal. [1] will change the behavior of table
jumbling without introducing a GUC, and I don't think we should introduce
a GUC for the squash values case either. Why one behavior change is configurable
while the other is not? seems confusing, IMO.

Also, as a matter of principle, it seems most are favoring not
introducing GUCs to
configure queryId behavior. I agree.

[1] https://www.postgresql.org/message-id/flat/CAA5RZ0uNofEXfEfNw3uRN3D3oXkFPQ_s%2BhuLLHMKR_%2BMCk8RPQ%4...

--
Sami Imseih
Amazon Web Services (AWS)





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
  2025-03-25 16:53   ` Re: Squash constant lists in query jumbling by default Tom Lane <[email protected]>
  2025-03-25 17:27     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>
@ 2025-03-25 18:40       ` Dmitry Dolgov <[email protected]>
  2025-03-25 19:33         ` Re: Squash constant lists in query jumbling by default Sami Imseih <[email protected]>
  2025-03-25 20:05         ` Re: Squash constant lists in query jumbling by default Dmitry Dolgov <[email protected]>
  1 sibling, 2 replies; 10+ messages in thread

From: Dmitry Dolgov @ 2025-03-25 18:40 UTC (permalink / raw)
  To: Álvaro Herrera <[email protected]>; +Cc: Tom Lane <[email protected]>; Christoph Berg <[email protected]>; PostgreSQL Developers <[email protected]>; Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Julien Rouhaud <[email protected]>

>
>
>> On Tue, Mar 25, 2025, 6:28 PM Álvaro Herrera <[email protected]>
wrote:

On 2025-Mar-25, Tom Lane wrote:

> If this GUC sticks around, it should be at least PGC_SUSET (on

> the analogy of compute_query_id) to make it harder to break

> pg_stat_statements that way.


I have no problem making it superuser-only, and I can see making "on" be

the default.  I am not opposed to removing it completely either, if we

really think that the current behavior is no longer useful for anybody.


I'm in favor of removing the GUC of course, but if memory serves there
were some folks in the patch discussion thread, who claimed they would
need to be able to keep non-squashed behavior. I don't recall if there were
particular arguments to support that, will try to find those messages again.
But overall as long as nobody objects, I think it's fine to get rid of GUC.

Earlier in the discussion, other possible values for the option were
suggested, such as a way to distinguish arrays that had "lots" (say,
hundreds or more) of entries from arrays that were "small".  That could
be selected by the user (or site admin) using this GUC, though there was
no agreement on exactly what that would be.  During the FOSDEM 2024
development meeting there was a general dislike of this idea, which
AFAIR was mostly predicated on the displayed query no longer being valid
SQL.  But now that we've chosen a format that uses SQL comments, this is
no longer a problem, so I think we haven't closed that door yet.  But we
may still find out that no user cares about this.


Agree, the way how things work now brings this option back on the table.
I can refresh the patch doing this, but I'm afk for about a week so it will
take some time. At the same time the proposal to do squashing by default
does not seem to be strictly dependent on that, so maybe they could be
considered as isolated ideas.

>


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
  2025-03-25 16:53   ` Re: Squash constant lists in query jumbling by default Tom Lane <[email protected]>
  2025-03-25 17:27     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>
  2025-03-25 18:40       ` Re: Squash constant lists in query jumbling by default Dmitry Dolgov <[email protected]>
@ 2025-03-25 19:33         ` Sami Imseih <[email protected]>
  1 sibling, 0 replies; 10+ messages in thread

From: Sami Imseih @ 2025-03-25 19:33 UTC (permalink / raw)
  To: Dmitry Dolgov <[email protected]>; +Cc: Álvaro Herrera <[email protected]>; Tom Lane <[email protected]>; Christoph Berg <[email protected]>; PostgreSQL Developers <[email protected]>; Michael Paquier <[email protected]>; Julien Rouhaud <[email protected]>

> At the same time the proposal to do squashing by default
> does not seem to be strictly dependent on that, so maybe they could be
> considered as isolated ideas.

Here is a patch to remove the GUC, if we settle on doing so.

--
Sami Imseih
Amazon Web Services (AWS)


Attachments:

  [application/octet-stream] v1-0001-Remove-the-query_id_squash_values-GUC.patch (12.3K, 2-v1-0001-Remove-the-query_id_squash_values-GUC.patch)
  download | inline diff:
From 154ed3caa2df5377504b52507385ae5d5a38e7ba Mon Sep 17 00:00:00 2001
From: "Sami Imseih (AWS)"
 <[email protected]>
Date: Tue, 25 Mar 2025 19:18:08 +0000
Subject: [PATCH v1 1/1] Remove the query_id_squash_values GUC

62d712ecfd94 introduced the capability to calculate the same queryId
for queries with different lengths of constants in a list for an IN clause.
This behavior was originally enabled with a GUC query_id_squash_values. After
a discussion about the value of such a GUC, it was decided to back out of
the use of a GUC and make the squashing behavior the only available option.

Discussion: https://www.postgresql.org/message-id/flat/CA%2Bq6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ%40mail.gmail.com#3eaafc6125eb61b869b1218438480640
---
 contrib/pg_stat_statements/expected/dml.out   |  2 +-
 .../pg_stat_statements/expected/squashing.out | 33 -------------------
 contrib/pg_stat_statements/sql/squashing.sql  | 10 ------
 doc/src/sgml/config.sgml                      | 30 -----------------
 doc/src/sgml/pgstatstatements.sgml            |  6 ++--
 src/backend/nodes/queryjumblefuncs.c          | 16 +++------
 src/backend/postmaster/launch_backend.c       |  3 --
 src/backend/utils/misc/guc_tables.c           | 10 ------
 src/backend/utils/misc/postgresql.conf.sample |  1 -
 src/include/nodes/queryjumble.h               |  1 -
 10 files changed, 8 insertions(+), 104 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/dml.out b/contrib/pg_stat_statements/expected/dml.out
index acc2c5e5241..347cb8699e4 100644
--- a/contrib/pg_stat_statements/expected/dml.out
+++ b/contrib/pg_stat_statements/expected/dml.out
@@ -80,7 +80,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
      1 |   10 | INSERT INTO pgss_dml_tab VALUES(generate_series($1, $2), $3)
      1 |   12 | SELECT * FROM pgss_dml_tab ORDER BY a
      2 |    4 | SELECT * FROM pgss_dml_tab WHERE a > $1 ORDER BY a
-     1 |    8 | SELECT * FROM pgss_dml_tab WHERE a IN ($1, $2, $3, $4, $5)
+     1 |    8 | SELECT * FROM pgss_dml_tab WHERE a IN ($1 /*, ... */)
      1 |    1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
      1 |    0 | SET pg_stat_statements.track_utility = $1
      6 |    6 | UPDATE pgss_dml_tab SET b = $1 WHERE a = $2
diff --git a/contrib/pg_stat_statements/expected/squashing.out b/contrib/pg_stat_statements/expected/squashing.out
index 55aa5109433..7b138af098c 100644
--- a/contrib/pg_stat_statements/expected/squashing.out
+++ b/contrib/pg_stat_statements/expected/squashing.out
@@ -4,39 +4,7 @@
 CREATE EXTENSION pg_stat_statements;
 CREATE TABLE test_squash (id int, data int);
 -- IN queries
--- No squashing is performed, as a baseline result
-SELECT pg_stat_statements_reset() IS NOT NULL AS t;
- t 
----
- t
-(1 row)
-
-SELECT * FROM test_squash WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9);
- id | data 
-----+------
-(0 rows)
-
-SELECT * FROM test_squash WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
- id | data 
-----+------
-(0 rows)
-
-SELECT * FROM test_squash WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);
- id | data 
-----+------
-(0 rows)
-
-SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
-                                        query                                         | calls 
---------------------------------------------------------------------------------------+-------
- SELECT * FROM test_squash WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9)           |     1
- SELECT * FROM test_squash WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)      |     1
- SELECT * FROM test_squash WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11) |     1
- SELECT pg_stat_statements_reset() IS NOT NULL AS t                                   |     1
-(4 rows)
-
 -- Normal scenario, too many simple constants for an IN query
-SET query_id_squash_values = on;
 SELECT pg_stat_statements_reset() IS NOT NULL AS t;
  t 
 ---
@@ -461,4 +429,3 @@ SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT pg_stat_statements_reset() IS NOT NULL AS t |     1
 (2 rows)
 
-RESET query_id_squash_values;
diff --git a/contrib/pg_stat_statements/sql/squashing.sql b/contrib/pg_stat_statements/sql/squashing.sql
index 56ee8ccb9a1..908be81ff2b 100644
--- a/contrib/pg_stat_statements/sql/squashing.sql
+++ b/contrib/pg_stat_statements/sql/squashing.sql
@@ -7,16 +7,7 @@ CREATE TABLE test_squash (id int, data int);
 
 -- IN queries
 
--- No squashing is performed, as a baseline result
-SELECT pg_stat_statements_reset() IS NOT NULL AS t;
-SELECT * FROM test_squash WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9);
-SELECT * FROM test_squash WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
-SELECT * FROM test_squash WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);
-SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
-
 -- Normal scenario, too many simple constants for an IN query
-SET query_id_squash_values = on;
-
 SELECT pg_stat_statements_reset() IS NOT NULL AS t;
 SELECT * FROM test_squash WHERE id IN (1);
 SELECT * FROM test_squash WHERE id IN (1, 2, 3);
@@ -177,4 +168,3 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
 SELECT ARRAY[1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
 SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
 
-RESET query_id_squash_values;
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 69fc93dffc4..c51fc657042 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8718,36 +8718,6 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>
 
-     <varlistentry id="guc-query-id-squash-values" xreflabel="query_id_squash_values">
-      <term><varname>query_id_squash_values</varname> (<type>bool</type>)
-      <indexterm>
-       <primary><varname>query_id_squash_values</varname> configuration parameter</primary>
-      </indexterm>
-      </term>
-      <listitem>
-       <para>
-        Specifies how a list of constants (e.g., for an <literal>IN</literal>
-        clause) contributes to the query identifier computation.
-        Normally, every element of such a list contributes to the query
-        identifier separately, which means that two queries that only differ
-        in the number of elements in such a list would get different query
-        identifiers.
-        If this parameter is on, a list of constants will not contribute
-        to the query identifier. This means that two queries whose only
-        difference is the number of constants in such a list are going to get the
-        same query identifier.
-       </para>
-       <para>
-        Only constants are affected; bind parameters do not benefit from this
-        functionality. The default value is <literal>off</literal>.
-       </para>
-       <para>
-        This parameter also affects how <xref linkend="pgstatstatements"/>
-        generates normalized query texts.
-       </para>
-      </listitem>
-     </varlistentry>
-
      <varlistentry id="guc-log-statement-stats">
       <term><varname>log_statement_stats</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index f4e384e95ae..4e9088da67e 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -632,13 +632,11 @@
    In some cases, queries with visibly different texts might get merged into a
    single <structname>pg_stat_statements</structname> entry; as explained above,
    this is expected to happen for semantically equivalent queries.
-   In addition, if <varname>query_id_squash_values</varname> is enabled
-   and the only difference between queries is the number of elements in a list
-   of constants, the list will get squashed down to a single element but shown
+   In addition, if the only difference between queries is the number of elements
+   in a list of constants, the list will get squashed down to a single element but shown
    with a commented-out list indicator:
 
 <screen>
-=# SET query_id_squash_values = on;
 =# SELECT pg_stat_statements_reset();
 =# SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5, 6, 7);
 =# SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5, 6, 7, 8);
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index f8b0f91704b..816a7fb2be7 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -46,9 +46,6 @@
 /* GUC parameters */
 int			compute_query_id = COMPUTE_QUERY_ID_AUTO;
 
-/* Whether to merge constants in a list when computing query_id */
-bool		query_id_squash_values = false;
-
 /*
  * True when compute_query_id is ON or AUTO, and a module requests them.
  *
@@ -303,7 +300,7 @@ IsSquashableConstList(List *elements, Node **firstExpr, Node **lastExpr)
 	 * If squashing is disabled, or the list is too short, we don't try to
 	 * squash it.
 	 */
-	if (!query_id_squash_values || list_length(elements) < 2)
+	if (list_length(elements) < 2)
 		return false;
 
 	foreach(temp, elements)
@@ -340,13 +337,10 @@ do { \
 #include "queryjumblefuncs.funcs.c"
 
 /*
- * When query_id_squash_values is enabled, we jumble lists of constant
- * elements as one individual item regardless of how many elements are
- * in the list.  This means different queries jumble to the same query_id,
- * if the only difference is the number of elements in the list.
- *
- * If query_id_squash_values is disabled or the list is not "simple
- * enough", we jumble each element normally.
+ * We jumble lists of constant elements as one individual item regardless
+ * of how many elements are in the list.  This means different queries
+ * jumble to the same query_id, if the only difference is the number of
+ * elements in the list.
  */
 static void
 _jumbleElements(JumbleState *jstate, List *elements)
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 9e6c900d146..bf6b55ee830 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -116,7 +116,6 @@ typedef struct
 	bool		redirection_done;
 	bool		IsBinaryUpgrade;
 	bool		query_id_enabled;
-	bool		query_id_squash_values;
 	int			max_safe_fds;
 	int			MaxBackends;
 	int			num_pmchild_slots;
@@ -778,7 +777,6 @@ save_backend_variables(BackendParameters *param,
 	param->redirection_done = redirection_done;
 	param->IsBinaryUpgrade = IsBinaryUpgrade;
 	param->query_id_enabled = query_id_enabled;
-	param->query_id_squash_values = query_id_squash_values;
 	param->max_safe_fds = max_safe_fds;
 
 	param->MaxBackends = MaxBackends;
@@ -1039,7 +1037,6 @@ restore_backend_variables(BackendParameters *param)
 	redirection_done = param->redirection_done;
 	IsBinaryUpgrade = param->IsBinaryUpgrade;
 	query_id_enabled = param->query_id_enabled;
-	query_id_squash_values = param->query_id_squash_values;
 	max_safe_fds = param->max_safe_fds;
 
 	MaxBackends = param->MaxBackends;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 989825d3a9c..76c7c6bb4b1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2122,16 +2122,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"query_id_squash_values", PGC_USERSET, STATS_MONITORING,
-			gettext_noop("Allows to merge constants in a list when computing "
-						 "query_id."),
-		},
-		&query_id_squash_values,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"vacuum_truncate", PGC_USERSET, VACUUM_DEFAULT,
 			gettext_noop("Enables vacuum to truncate empty pages at the end of the table."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 0b9e3066bde..1df670dca71 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -663,7 +663,6 @@
 # - Monitoring -
 
 #compute_query_id = auto
-#query_id_squash_values = off
 #log_statement_stats = off
 #log_parser_stats = off
 #log_planner_stats = off
diff --git a/src/include/nodes/queryjumble.h b/src/include/nodes/queryjumble.h
index 905f66bc0bd..47e9cb3d0f6 100644
--- a/src/include/nodes/queryjumble.h
+++ b/src/include/nodes/queryjumble.h
@@ -74,7 +74,6 @@ extern JumbleState *JumbleQuery(Query *query);
 extern void EnableQueryId(void);
 
 extern PGDLLIMPORT bool query_id_enabled;
-extern PGDLLIMPORT bool query_id_squash_values;
 
 /*
  * Returns whether query identifier computation has been enabled, either
-- 
2.47.1



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
  2025-03-25 16:53   ` Re: Squash constant lists in query jumbling by default Tom Lane <[email protected]>
  2025-03-25 17:27     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>
  2025-03-25 18:40       ` Re: Squash constant lists in query jumbling by default Dmitry Dolgov <[email protected]>
@ 2025-03-25 20:05         ` Dmitry Dolgov <[email protected]>
  1 sibling, 0 replies; 10+ messages in thread

From: Dmitry Dolgov @ 2025-03-25 20:05 UTC (permalink / raw)
  To: Álvaro Herrera <[email protected]>; +Cc: Tom Lane <[email protected]>; Christoph Berg <[email protected]>; PostgreSQL Developers <[email protected]>; Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Julien Rouhaud <[email protected]>

On Tue, Mar 25, 2025, 7:40 PM Dmitry Dolgov <[email protected]> wrote:

>
>>> On Tue, Mar 25, 2025, 6:28 PM Álvaro Herrera <[email protected]>
> wrote:
>
> On 2025-Mar-25, Tom Lane wrote:
>
> > If this GUC sticks around, it should be at least PGC_SUSET (on
>
> > the analogy of compute_query_id) to make it harder to break
>
> > pg_stat_statements that way.
>
>
> I have no problem making it superuser-only, and I can see making "on" be
>
> the default.  I am not opposed to removing it completely either, if we
>
> really think that the current behavior is no longer useful for anybody.
>
>
> I'm in favor of removing the GUC of course, but if memory serves there
> were some folks in the patch discussion thread, who claimed they would
> need to be able to keep non-squashed behavior. I don't recall if there were
> particular arguments to support that, will try to find those messages
> again.
>

Nevermind, I've checked it out -- I think the case I had in mind [1] in fact
supports GUC removal:

> If anyone subtly changes jumbling logic when the extension is
active, the instance could get huge performance issues.

[1]:
https://www.postgresql.org/message-id/b8721722-a73e-0ee9-6513-425e9c88d92f%40postgrespro.ru

>


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
@ 2025-03-25 21:24   ` Laurenz Albe <[email protected]>
  2025-03-27 08:47     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>
  1 sibling, 1 reply; 10+ messages in thread

From: Laurenz Albe @ 2025-03-25 21:24 UTC (permalink / raw)
  To: Christoph Berg <[email protected]>; [email protected]; +Cc: Álvaro Herrera <[email protected]>; Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Tom Lane <[email protected]>; Julien Rouhaud <[email protected]>

On Tue, 2025-03-25 at 17:28 +0100, Christoph Berg wrote:
> The "jumble names of temp tables" thread was briefly touching this [1],
> I'm starting a new thread since the others are already very long.
> 
> [1] https://www.postgresql.org/message-id/flat/CAA5RZ0uNofEXfEfNw3uRN3D3oXkFPQ_s%2BhuLLHMKR_%2BMCk8RPQ%4...
> 
> Two points were made:
> 
> 1) this should be on by default
> 2) there should be no GUC for it.

+1 on both

Yours,
Laurenz Albe





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Squash constant lists in query jumbling by default
  2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
  2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
  2025-03-25 21:24   ` Re: Squash constant lists in query jumbling by default Laurenz Albe <[email protected]>
@ 2025-03-27 08:47     ` Álvaro Herrera <[email protected]>
  0 siblings, 0 replies; 10+ messages in thread

From: Álvaro Herrera @ 2025-03-27 08:47 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Christoph Berg <[email protected]>; [email protected]; Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Tom Lane <[email protected]>; Julien Rouhaud <[email protected]>

On 2025-Mar-25, Laurenz Albe wrote:

> On Tue, 2025-03-25 at 17:28 +0100, Christoph Berg wrote:
> > The "jumble names of temp tables" thread was briefly touching this [1],
> > I'm starting a new thread since the others are already very long.
> > 
> > [1] https://www.postgresql.org/message-id/flat/CAA5RZ0uNofEXfEfNw3uRN3D3oXkFPQ_s%2BhuLLHMKR_%2BMCk8RPQ%4...
> > 
> > Two points were made:
> > 
> > 1) this should be on by default
> > 2) there should be no GUC for it.
> 
> +1 on both

Well, the votes are quite clear, so I have pushed Sami's patch.

Thank, everybody!

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"That sort of implies that there are Emacs keystrokes which aren't obscure.
I've been using it daily for 2 years now and have yet to discover any key
sequence which makes any sense."                        (Paul Thomas)






^ permalink  raw  reply  [nested|flat] 10+ messages in thread


end of thread, other threads:[~2025-03-27 08:47 UTC | newest]

Thread overview: 10+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-03-18 17:58 pgsql: Introduce squashing of constant lists in query jumbling Álvaro Herrera <[email protected]>
2025-03-25 16:28 ` Squash constant lists in query jumbling by default Christoph Berg <[email protected]>
2025-03-25 16:53   ` Re: Squash constant lists in query jumbling by default Tom Lane <[email protected]>
2025-03-25 17:27     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>
2025-03-25 17:55       ` Re: Squash constant lists in query jumbling by default Sami Imseih <[email protected]>
2025-03-25 18:40       ` Re: Squash constant lists in query jumbling by default Dmitry Dolgov <[email protected]>
2025-03-25 19:33         ` Re: Squash constant lists in query jumbling by default Sami Imseih <[email protected]>
2025-03-25 20:05         ` Re: Squash constant lists in query jumbling by default Dmitry Dolgov <[email protected]>
2025-03-25 21:24   ` Re: Squash constant lists in query jumbling by default Laurenz Albe <[email protected]>
2025-03-27 08:47     ` Re: Squash constant lists in query jumbling by default Álvaro Herrera <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox