Re: query_id: jumble names of temp tables for better pg_stat

public inbox for [email protected]  
help / color / mirror / Atom feed

Re: query_id: jumble names of temp tables for better pg_stat_statement UX
7+ messages / 4 participants
[nested] [flat]

* Re: query_id: jumble names of temp tables for better pg_stat_statement UX
@ 2025-03-23 00:47  Michael Paquier <[email protected]>
  0 siblings, 1 reply; 7+ messages in thread

From: Michael Paquier @ 2025-03-23 00:47 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Christoph Berg <[email protected]>; PostgreSQL Hackers <[email protected]>; ma lz <[email protected]>

On Sat, Mar 22, 2025 at 12:24:43PM -0400, Tom Lane wrote:
> I experimented with this trivial fix (shown in-line to keep the cfbot
> from thinking this is the patch-of-record):
> 
> What's happening there is that there's an ALTER TABLE ADD COLUMN in
> the test, so the executions after the first one see more entries
> in eref->colnames and come up with a different jumble.  I think
> we probably don't want that behavior; we only want to jumble the
> table name.  So we'd still need the v3-0001 patch in some form to
> allow annotating RangeTblEntry.eref with a custom jumble method
> that'd only jumble the aliasname.

Alias.aliasname is not qualified, so it means that we'd begin to
assign the same query ID even if using two relations from two schemas
depending on what search_path assigns, no?  Say:
create schema popo1;
create schema popo2;
create table popo1.aa (a int, b int);
create table popo2.aa (a int, b int);
set search_path = 'popo1';
select count(*) from aa;
set search_path = 'popo2';
select count(*) from aa;

=# select query, calls from pg_stat_statements where
     query ~ 'select count';
          query          | calls
-------------------------+-------
 select count(*) from aa |     2
(1 row)

Perhaps that's OK because such queries use the same query string, but
just silencing the relid means that we'd lose the namespace reference
entirely, making the stats potentially fuzzier depending on the
workload.  On HEAD, one can guess the query ID with an EXPLAIN and a
search_path, as well, so currently it's possible to cross-check the 
contents of pgss.  But we'd lose this possibility here..
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: query_id: jumble names of temp tables for better pg_stat_statement UX
@ 2025-03-23 01:04  Tom Lane <[email protected]>
  parent: Michael Paquier <[email protected]>
  0 siblings, 1 reply; 7+ messages in thread

From: Tom Lane @ 2025-03-23 01:04 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Christoph Berg <[email protected]>; PostgreSQL Hackers <[email protected]>; ma lz <[email protected]>

Michael Paquier <[email protected]> writes:
> Alias.aliasname is not qualified, so it means that we'd begin to
> assign the same query ID even if using two relations from two schemas
> depending on what search_path assigns, no?

Right.  I'm arguing that that's good.  The proposed patch already
obscures the difference between similar table names in different
(temp) schemas, and I'm suggesting that taking that a bit further
would be fine.

Note that if the tables we're considering don't have identical
rowtypes, the queries would likely jumble differently anyway
due to differences in Vars' varattno and vartype.

			regards, tom lane

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: query_id: jumble names of temp tables for better pg_stat_statement UX
@ 2025-03-23 06:38  Michael Paquier <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 2 replies; 7+ messages in thread

From: Michael Paquier @ 2025-03-23 06:38 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Christoph Berg <[email protected]>; PostgreSQL Hackers <[email protected]>; ma lz <[email protected]>

On Sat, Mar 22, 2025 at 09:04:19PM -0400, Tom Lane wrote:
> Right.  I'm arguing that that's good.  The proposed patch already
> obscures the difference between similar table names in different
> (temp) schemas, and I'm suggesting that taking that a bit further
> would be fine.
> 
> Note that if the tables we're considering don't have identical
> rowtypes, the queries would likely jumble differently anyway
> due to differences in Vars' varattno and vartype.

Not for the types AFAIK, the varattnos count in, but perhaps for the
same argument as previously it's just kind of OK?  Please see the
tests in the attached about that.

I've spent a few hours looking at the diffs of a pgss dump before and 
after the fact.  The reduction in the number of entries seem to come
mainly from tests where we do a successive creates and drops of the
same table name.  There are quite a few of them for updatable views,
but it's not the only one.  A good chunk comes from tables with
simpler and rather generic names.

So your idea to use the relation name in eref while skipping the
column list looks kind of promising.  Per se the attached.  Thoughts?
--
Michael


Attachments:

  [text/x-diff] v4-0001-Add-support-for-custom_query_jumble-at-field-leve.patch (4.0K, 2-v4-0001-Add-support-for-custom_query_jumble-at-field-leve.patch)
  download | inline diff:
From fd2bc0e26134b7d74fe33da99eab6623a9859f22 Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Sun, 23 Mar 2025 15:33:05 +0900
Subject: [PATCH v4 1/2] Add support for custom_query_jumble at field-level

This option gives the possibility for query jumbling to force custom
jumbling policies specific to a field of a node.  When dealing with
complex node structures, this is simpler than having to enforce a custom
function across the full node.

Custom functions need to be defined as _jumble${node}_${field}, are
given in input the parent node and the field value.  The field value is
not really required if we have the parent Node, but it makes custom
implementations easier to follow with field-specific definitions and
values.  The code generated by gen_node_support.pl uses a macro called
JUMBLE_CUSTOM(), that hides the objects specific to queryjumblefuncs.c.
---
 src/include/nodes/nodes.h             |  4 ++++
 src/backend/nodes/gen_node_support.pl | 15 +++++++++++++--
 src/backend/nodes/queryjumblefuncs.c  |  6 ++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d18044b4e650..adec68a45ab8 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -54,6 +54,7 @@ typedef enum NodeTag
  *   readfuncs.c.
  *
  * - custom_query_jumble: Has custom implementation in queryjumblefuncs.c.
+ *   Available as well as a per-field attribute.
  *
  * - no_copy: Does not support copyObject() at all.
  *
@@ -101,6 +102,9 @@ typedef enum NodeTag
  * - equal_ignore_if_zero: Ignore the field for equality if it is zero.
  *   (Otherwise, compare normally.)
  *
+ * - custom_query_jumble: Has custom implementation in queryjumblefuncs.c
+ *   applied for the field of a node.  Available as well as a node attribute.
+ *
  * - query_jumble_ignore: Ignore the field for the query jumbling.  Note
  *   that typmod and collation information are usually irrelevant for the
  *   query jumbling.
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7e3f335ac09d..2d62ecb1d9d6 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -471,6 +471,7 @@ foreach my $infile (@ARGV)
 								&& $attr !~ /^read_as\(\w+\)$/
 								&& !elem $attr,
 								qw(copy_as_scalar
+								custom_query_jumble
 								equal_as_scalar
 								equal_ignore
 								equal_ignore_if_zero
@@ -1283,12 +1284,17 @@ _jumble${n}(JumbleState *jstate, Node *node)
 		my $t = $node_type_info{$n}->{field_types}{$f};
 		my @a = @{ $node_type_info{$n}->{field_attrs}{$f} };
 		my $query_jumble_ignore = $struct_no_query_jumble;
+		my $query_jumble_custom = 0;
 		my $query_jumble_location = 0;
 		my $query_jumble_squash = 0;
 
 		# extract per-field attributes
 		foreach my $a (@a)
 		{
+			if ($a eq 'custom_query_jumble')
+			{
+				$query_jumble_custom = 1;
+			}
 			if ($a eq 'query_jumble_ignore')
 			{
 				$query_jumble_ignore = 1;
@@ -1304,8 +1310,13 @@ _jumble${n}(JumbleState *jstate, Node *node)
 		}
 
 		# node type
-		if (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
-			and elem $1, @node_types)
+		if ($query_jumble_custom)
+		{
+			# Custom function that applies to one field of a node.
+			print $jff "\tJUMBLE_CUSTOM($n, $f);\n";
+		}
+		elsif (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
+			   and elem $1, @node_types)
 		{
 			# Squash constants if requested.
 			if ($query_jumble_squash)
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 189bfda610aa..9b9cf6f34381 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -333,6 +333,12 @@ do { \
 	if (expr->str) \
 		AppendJumble(jstate, (const unsigned char *) (expr->str), strlen(expr->str) + 1); \
 } while(0)
+/*
+ * Note that the argument types are enforced for the per-field custom
+ * functions.
+ */
+#define JUMBLE_CUSTOM(nodetype, item) \
+	_jumble##nodetype##_##item(jstate, expr, expr->item)
 
 #include "queryjumblefuncs.funcs.c"
 
-- 
2.49.0



  [text/x-diff] v4-0002-Add-custom-query-jumble-function-for-RangeTblEntr.patch (10.1K, 3-v4-0002-Add-custom-query-jumble-function-for-RangeTblEntr.patch)
  download | inline diff:
From e91d0ff80c3670d2a7656e1711581c4df2a58c21 Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Sun, 23 Mar 2025 15:34:29 +0900
Subject: [PATCH v4 2/2] Add custom query jumble function for
 RangeTblEntry.eref

---
 src/include/nodes/parsenodes.h                |  10 +-
 src/backend/nodes/queryjumblefuncs.c          |  20 ++
 .../pg_stat_statements/expected/select.out    | 191 ++++++++++++++++++
 contrib/pg_stat_statements/sql/select.sql     |  57 ++++++
 4 files changed, 275 insertions(+), 3 deletions(-)

diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf2..36363de3a769 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1050,8 +1050,12 @@ typedef struct RangeTblEntry
 	 */
 	/* user-written alias clause, if any */
 	Alias	   *alias pg_node_attr(query_jumble_ignore);
-	/* expanded reference names */
-	Alias	   *eref pg_node_attr(query_jumble_ignore);
+	/*
+	 * Expanded reference names.  This uses a custom query jumble function
+	 * so as the table name is included in the computation, not its list
+	 * of columns.
+	 */
+	Alias	   *eref pg_node_attr(custom_query_jumble);
 
 	RTEKind		rtekind;		/* see above */
 
@@ -1094,7 +1098,7 @@ typedef struct RangeTblEntry
 	 * tables to be invalidated if the underlying table is altered.
 	 */
 	/* OID of the relation */
-	Oid			relid;
+	Oid			relid pg_node_attr(query_jumble_ignore);
 	/* inheritance requested? */
 	bool		inh;
 	/* relation kind (see pg_class.relkind) */
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 9b9cf6f34381..896bdc663a61 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -33,6 +33,7 @@
 #include "postgres.h"
 
 #include "access/transam.h"
+#include "catalog/namespace.h"
 #include "catalog/pg_proc.h"
 #include "common/hashfn.h"
 #include "miscadmin.h"
@@ -67,6 +68,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
 static void _jumbleA_Const(JumbleState *jstate, Node *node);
 static void _jumbleList(JumbleState *jstate, Node *node);
 static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry_eref(JumbleState *jstate,
+									  RangeTblEntry *rte,
+									  Alias *expr);
 
 /*
  * Given a possibly multi-statement source string, confine our attention to the
@@ -519,3 +523,19 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
 	JUMBLE_FIELD(is_local);
 	JUMBLE_LOCATION(location);
 }
+
+/*
+ * Custom query jumble function for RangeTblEntry.eref.
+ */
+static void
+_jumbleRangeTblEntry_eref(JumbleState *jstate,
+						  RangeTblEntry *rte,
+						  Alias *expr)
+{
+	JUMBLE_FIELD(type);
+	/*
+	 * This includes only the table name, the list of column names is
+	 * ignored.
+	 */
+	JUMBLE_STRING(aliasname);
+}
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 37a30af034a6..218717b9a765 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -413,3 +413,194 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
  t
 (1 row)
 
+-- Temporary tables, grouped together.
+BEGIN;
+  CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+  SELECT * FROM temp_t;
+ id 
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+  CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+  SELECT * FROM temp_t;
+ id 
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls |                                 query                                  
+-------+------------------------------------------------------------------------
+     2 | SELECT * FROM temp_t
+     0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+     1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(3 rows)
+
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
+(1 row)
+
+-- search_path with various schemas and temporary tables
+CREATE SCHEMA pgss_schema_1;
+CREATE SCHEMA pgss_schema_2;
+-- Same attributes.
+CREATE TABLE pgss_schema_1.tab_search_same (a int, b int);
+CREATE TABLE pgss_schema_2.tab_search_same (a int, b int);
+CREATE TEMP TABLE tab_search_same (a int, b int);
+-- Different number of attributes, mapping types
+CREATE TABLE pgss_schema_1.tab_search_diff_1 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_1 (a int, b int);
+CREATE TEMP TABLE tab_search_diff_1 (a int, b int, c int);
+-- Same number of attributes, different types
+CREATE TABLE pgss_schema_1.tab_search_diff_2 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_2 (a text);
+CREATE TEMP TABLE tab_search_diff_2 (a bigint);
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b 
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count 
+-------
+     0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a FROM tab_search_diff_2;
+ a 
+---
+(0 rows)
+
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b 
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count 
+-------
+     0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a FROM tab_search_diff_2;
+ a 
+---
+(0 rows)
+
+SET search_path = 'pg_temp';
+SELECT count(*) FROM tab_search_same;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b 
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count 
+-------
+     0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a FROM tab_search_diff_2;
+ a 
+---
+(0 rows)
+
+RESET search_path;
+SELECT count(*) FROM pgss_schema_1.tab_search_same;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a, b FROM pgss_schema_1.tab_search_same;
+ a | b 
+---+---
+(0 rows)
+
+SELECT count(*) FROM pgss_schema_2.tab_search_diff_1;
+ count 
+-------
+     0
+(1 row)
+
+SELECT count(*) FROM pg_temp.tab_search_diff_2;
+ count 
+-------
+     0
+(1 row)
+
+SELECT a FROM pgss_schema_2.tab_search_diff_2;
+ a 
+---
+(0 rows)
+
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls |                                 query                                  
+-------+------------------------------------------------------------------------
+     4 | SELECT a FROM tab_search_diff_2
+     4 | SELECT a, b FROM tab_search_same
+     0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+     4 | SELECT count(*) FROM tab_search_diff_1
+     4 | SELECT count(*) FROM tab_search_diff_2
+     4 | SELECT count(*) FROM tab_search_same
+     1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(7 rows)
+
+DROP SCHEMA pgss_schema_1 CASCADE;
+NOTICE:  drop cascades to 3 other objects
+DETAIL:  drop cascades to table pgss_schema_1.tab_search_same
+drop cascades to table pgss_schema_1.tab_search_diff_1
+drop cascades to table pgss_schema_1.tab_search_diff_2
+DROP SCHEMA pgss_schema_2 CASCADE;
+NOTICE:  drop cascades to 3 other objects
+DETAIL:  drop cascades to table pgss_schema_2.tab_search_same
+drop cascades to table pgss_schema_2.tab_search_diff_1
+drop cascades to table pgss_schema_2.tab_search_diff_2
+DROP TABLE tab_search_same, tab_search_diff_1, tab_search_diff_2;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
+(1 row)
+
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index e0be58d5e24b..64603f85a411 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -148,3 +148,60 @@ SELECT (
 
 SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%SELECT GROUPING%';
 SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
+-- Temporary tables, grouped together.
+BEGIN;
+  CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+  SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+  CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+  SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
+-- search_path with various schemas and temporary tables
+CREATE SCHEMA pgss_schema_1;
+CREATE SCHEMA pgss_schema_2;
+-- Same attributes.
+CREATE TABLE pgss_schema_1.tab_search_same (a int, b int);
+CREATE TABLE pgss_schema_2.tab_search_same (a int, b int);
+CREATE TEMP TABLE tab_search_same (a int, b int);
+-- Different number of attributes, mapping types
+CREATE TABLE pgss_schema_1.tab_search_diff_1 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_1 (a int, b int);
+CREATE TEMP TABLE tab_search_diff_1 (a int, b int, c int);
+-- Same number of attributes, different types
+CREATE TABLE pgss_schema_1.tab_search_diff_2 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_2 (a text);
+CREATE TEMP TABLE tab_search_diff_2 (a bigint);
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2;
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2;
+SET search_path = 'pg_temp';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2;
+RESET search_path;
+SELECT count(*) FROM pgss_schema_1.tab_search_same;
+SELECT a, b FROM pgss_schema_1.tab_search_same;
+SELECT count(*) FROM pgss_schema_2.tab_search_diff_1;
+SELECT count(*) FROM pg_temp.tab_search_diff_2;
+SELECT a FROM pgss_schema_2.tab_search_diff_2;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+DROP SCHEMA pgss_schema_1 CASCADE;
+DROP SCHEMA pgss_schema_2 CASCADE;
+DROP TABLE tab_search_same, tab_search_diff_1, tab_search_diff_2;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
-- 
2.49.0



  [application/pgp-signature] signature.asc (833B, 4-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: query_id: jumble names of temp tables for better pg_stat_statement UX
@ 2025-03-24 15:41  Christoph Berg <[email protected]>
  parent: Michael Paquier <[email protected]>
  1 sibling, 1 reply; 7+ messages in thread

From: Christoph Berg @ 2025-03-24 15:41 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Tom Lane <[email protected]>; PostgreSQL Hackers <[email protected]>; ma lz <[email protected]>

Re: Michael Paquier
> So your idea to use the relation name in eref while skipping the
> column list looks kind of promising.  Per se the attached.  Thoughts?

Makes sense to me, thanks for digging into it.

> +++ b/src/backend/nodes/queryjumblefuncs.c
> @@ -33,6 +33,7 @@
>  #include "postgres.h"
>  
>  #include "access/transam.h"
> +#include "catalog/namespace.h"

No longer needed.

> +++ b/contrib/pg_stat_statements/sql/select.sql
> +SET search_path = 'pgss_schema_1';
> +SELECT count(*) FROM tab_search_same;
> +SELECT a, b FROM tab_search_same;
> +SELECT count(*) FROM tab_search_diff_1;
> +SELECT count(*) FROM tab_search_diff_2;
> +SELECT a FROM tab_search_diff_2;
> +SET search_path = 'pgss_schema_1';

Should this be pgss_schema_2 ?

Christoph






^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: query_id: jumble names of temp tables for better pg_stat_statement UX
@ 2025-03-25 01:51  Sami Imseih <[email protected]>
  parent: Michael Paquier <[email protected]>
  1 sibling, 1 reply; 7+ messages in thread

From: Sami Imseih @ 2025-03-25 01:51 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Tom Lane <[email protected]>; Christoph Berg <[email protected]>; PostgreSQL Hackers <[email protected]>; ma lz <[email protected]>

> So your idea to use the relation name in eref while skipping the
> column list looks kind of promising.  Per se the attached.  Thoughts?

I feel really uneasy about this behavior becoming the default.
I can bet there are some users which run common queries across
different schemas ( e.g. multi-tenancy ) will consider this behavior a
regression
in pg_stat_statements as now all their common queries have been merged
into a single entry.

For example, I have seen users add comments to SQLs to differentiate
similar SQLs coming from different tenants. This patch makes this no longer a
somewhat decent workaround  to overcome the fact that pg_stat_statements
does not track schemas or search path.

```
select pg_stat_statements_reset();

set search_path = s1;
select /*+ user s1 */ * from foo;

set search_path = s2;
select /*+ user s2 */  * from foo;

reset search_path;
select userid, queryid, query, calls from public.pg_stat_statements;

test=# select userid, queryid, query, calls from public.pg_stat_statements;
 userid |       queryid        |               query               | calls
--------+----------------------+-----------------------------------+-------
     10 |  1788423388555345932 | select /*+ user s1 */ * from foo  |     2
     10 | -8935568138104064674 | select pg_stat_statements_reset() |     1
     10 | -8663970364987885379 | set search_path = $1              |     2
     10 | -6563543739552933350 | reset search_path                 |     1
(4 rows)
```

--
Sami Imseih
Amazon Web Services (AWS)






^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: query_id: jumble names of temp tables for better pg_stat_statement UX
@ 2025-03-25 02:27  Tom Lane <[email protected]>
  parent: Sami Imseih <[email protected]>
  0 siblings, 0 replies; 7+ messages in thread

From: Tom Lane @ 2025-03-25 02:27 UTC (permalink / raw)
  To: Sami Imseih <[email protected]>; +Cc: Michael Paquier <[email protected]>; Christoph Berg <[email protected]>; PostgreSQL Hackers <[email protected]>; ma lz <[email protected]>

Sami Imseih <[email protected]> writes:
> For example, I have seen users add comments to SQLs to differentiate
> similar SQLs coming from different tenants. This patch makes this no longer a
> somewhat decent workaround  to overcome the fact that pg_stat_statements
> does not track schemas or search path.

Well, the workaround is different, but that doesn't mean there is no
workaround.  You just need to alter a table alias in the query:

select * from foo s1;
select * from foo s2;

As against this, there is probably also a set of people who would
*like* identical queries on identically-rowtyped tables in different
schemas to be merged.  Right now they have no way to make that happen.

So yeah, it's a nontrivial behavioral change.  But I think on the
whole it's likely to be more useful.  We could always revert the
change during beta if we get too much pushback.

			regards, tom lane

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: query_id: jumble names of temp tables for better pg_stat_statement UX
@ 2025-03-25 03:47  Michael Paquier <[email protected]>
  parent: Christoph Berg <[email protected]>
  0 siblings, 0 replies; 7+ messages in thread

From: Michael Paquier @ 2025-03-25 03:47 UTC (permalink / raw)
  To: Christoph Berg <[email protected]>; +Cc: Tom Lane <[email protected]>; PostgreSQL Hackers <[email protected]>; ma lz <[email protected]>

On Mon, Mar 24, 2025 at 04:41:35PM +0100, Christoph Berg wrote:
> Re: Michael Paquier
>> +++ b/src/backend/nodes/queryjumblefuncs.c
>> @@ -33,6 +33,7 @@
>>  #include "postgres.h"
>>  
>>  #include "access/transam.h"
>> +#include "catalog/namespace.h"
> 
> No longer needed.

Indeed.

>> +++ b/contrib/pg_stat_statements/sql/select.sql
>> +SET search_path = 'pgss_schema_1';
>> +SELECT count(*) FROM tab_search_same;
>> +SELECT a, b FROM tab_search_same;
>> +SELECT count(*) FROM tab_search_diff_1;
>> +SELECT count(*) FROM tab_search_diff_2;
>> +SELECT a FROM tab_search_diff_2;
>> +SET search_path = 'pgss_schema_1';
> 
> Should this be pgss_schema_2 ?

Oops.  Nice copy-paste.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

end of thread, other threads:[~2025-03-25 03:47 UTC | newest]

Thread overview: 7+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-03-23 00:47 Re: query_id: jumble names of temp tables for better pg_stat_statement UX Michael Paquier <[email protected]>
2025-03-23 01:04 ` Tom Lane <[email protected]>
2025-03-23 06:38   ` Michael Paquier <[email protected]>
2025-03-24 15:41     ` Christoph Berg <[email protected]>
2025-03-25 03:47       ` Michael Paquier <[email protected]>
2025-03-25 01:51     ` Sami Imseih <[email protected]>
2025-03-25 02:27       ` Tom Lane <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox