public inbox for [email protected]
help / color / mirror / Atom feedFrom: Michael Paquier <[email protected]>
To: Christoph Berg <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: ma lz <[email protected]>
Subject: Re: query_id: jumble names of temp tables for better pg_stat_statement UX
Date: Fri, 21 Mar 2025 09:54:55 +0900
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <OSZP286MB2253413D25A5D93BEA4F6459F2432@OSZP286MB2253.JPNP286.PROD.OUTLOOK.COM>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
On Wed, Mar 19, 2025 at 01:02:54PM +0100, Christoph Berg wrote:
> You are of course right, that one-line comment was just snakeoil :D.
> Now there are proper ones, thanks.
I have been thinking about this patch for a couple of days. What
makes me unhappy in this proposal is that RangeTblEntry is a large and
complicated Node, and we only want to force a custom computation for
the "relid" portion of the node, while being able to also take some
decisions for what the parent node has. Leaving the existing
per-field query_jumble_ignore with the custom function is prone to
errors, as well, because we need to maintain some level of consistency
between parsenodes.h and src/backend/nodes/.
Hence here is a counter-proposal, that can bring the best of both
worlds: let's add support for custom_query_jumble at field level.
I've spent some time on that, and some concatenation in a macro used
by gen_node_support.pl to force a policy for the custom function name
and its arguments is proving to be rather straight-forward. This
approach addresses the problem of this thread, while also tackling my
concerns about complex node structures.
The custom functions are named _jumble${node}_${field}, with the field
and the parent node given as arguments. I agree that giving the field
is kind of pointless if you have the parent node, but I think that
this is better so as this forces developers to think about how to use
the field value with the node.
Please see the attached. What do you think?
--
Michael
Attachments:
[text/x-diff] v3-0001-Add-support-for-custom_query_jumble-at-field-leve.patch (3.8K, 2-v3-0001-Add-support-for-custom_query_jumble-at-field-leve.patch)
download | inline diff:
From 11e3154cabdc24feb14e91d35c0cfee5f6c0ca2c Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Fri, 21 Mar 2025 09:35:38 +0900
Subject: [PATCH v3 1/2] Add support for custom_query_jumble at field-level
This option gives the possibility for query jumbling to force custom
jumbling policies specific to a field of a node. When dealing with
complex node structures, this is simpler than having to enforce a custom
function across the full node.
Custom functions need to be defined as _jumble${node}_${field}, are
given in input the parent node and the field value. The field value is
not really required if we have the parent Node, but it makes custom
implementations easier to follow with field-specific definitions and
values. The code generated by gen_node_support.pl uses a macro called
JUMBLE_CUSTOM(), that hides the objects specific to queryjumblefuncs.c.
---
src/include/nodes/nodes.h | 4 ++++
src/backend/nodes/gen_node_support.pl | 11 +++++++++++
src/backend/nodes/queryjumblefuncs.c | 6 ++++++
3 files changed, 21 insertions(+)
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d18044b4e650..adec68a45ab8 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -54,6 +54,7 @@ typedef enum NodeTag
* readfuncs.c.
*
* - custom_query_jumble: Has custom implementation in queryjumblefuncs.c.
+ * Available as well as a per-field attribute.
*
* - no_copy: Does not support copyObject() at all.
*
@@ -101,6 +102,9 @@ typedef enum NodeTag
* - equal_ignore_if_zero: Ignore the field for equality if it is zero.
* (Otherwise, compare normally.)
*
+ * - custom_query_jumble: Has custom implementation in queryjumblefuncs.c
+ * applied for the field of a node. Available as well as a node attribute.
+ *
* - query_jumble_ignore: Ignore the field for the query jumbling. Note
* that typmod and collation information are usually irrelevant for the
* query jumbling.
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7e3f335ac09d..bcae70cea7d4 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -471,6 +471,7 @@ foreach my $infile (@ARGV)
&& $attr !~ /^read_as\(\w+\)$/
&& !elem $attr,
qw(copy_as_scalar
+ custom_query_jumble
equal_as_scalar
equal_ignore
equal_ignore_if_zero
@@ -1283,12 +1284,17 @@ _jumble${n}(JumbleState *jstate, Node *node)
my $t = $node_type_info{$n}->{field_types}{$f};
my @a = @{ $node_type_info{$n}->{field_attrs}{$f} };
my $query_jumble_ignore = $struct_no_query_jumble;
+ my $query_jumble_custom = 0;
my $query_jumble_location = 0;
my $query_jumble_squash = 0;
# extract per-field attributes
foreach my $a (@a)
{
+ if ($a eq 'custom_query_jumble')
+ {
+ $query_jumble_custom = 1;
+ }
if ($a eq 'query_jumble_ignore')
{
$query_jumble_ignore = 1;
@@ -1328,6 +1334,11 @@ _jumble${n}(JumbleState *jstate, Node *node)
unless $query_jumble_ignore;
}
}
+ elsif ($query_jumble_custom)
+ {
+ # Custom function that applies to one field of a node.
+ print $jff "\tJUMBLE_CUSTOM($n, $f);\n";
+ }
elsif ($t eq 'char*')
{
print $jff "\tJUMBLE_STRING($f);\n"
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 189bfda610aa..9b9cf6f34381 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -333,6 +333,12 @@ do { \
if (expr->str) \
AppendJumble(jstate, (const unsigned char *) (expr->str), strlen(expr->str) + 1); \
} while(0)
+/*
+ * Note that the argument types are enforced for the per-field custom
+ * functions.
+ */
+#define JUMBLE_CUSTOM(nodetype, item) \
+ _jumble##nodetype##_##item(jstate, expr, expr->item)
#include "queryjumblefuncs.funcs.c"
--
2.49.0
[text/x-diff] v3-0002-Add-custom-query-jumble-function-for-RangeTblEntr.patch (5.0K, 3-v3-0002-Add-custom-query-jumble-function-for-RangeTblEntr.patch)
download | inline diff:
From 1c5b1a99aee4d0872d42b6edfdaab1266d13a522 Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Fri, 21 Mar 2025 09:43:48 +0900
Subject: [PATCH v3 2/2] Add custom query jumble function for
RangeTblEntry.relid
---
src/include/nodes/parsenodes.h | 9 +++--
src/backend/nodes/queryjumblefuncs.c | 27 +++++++++++++++
.../pg_stat_statements/expected/utility.out | 33 +++++++++++++++++++
contrib/pg_stat_statements/sql/utility.sql | 12 +++++++
4 files changed, 79 insertions(+), 2 deletions(-)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf2..3eb16846e0e1 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1093,8 +1093,13 @@ typedef struct RangeTblEntry
* relation. This allows plans referencing AFTER trigger transition
* tables to be invalidated if the underlying table is altered.
*/
- /* OID of the relation */
- Oid relid;
+
+ /*
+ * OID of the relation. A custom query jumble function is used here for
+ * temporary tables, where the computation uses the relation name instead
+ * of the OID.
+ */
+ Oid relid pg_node_attr(custom_query_jumble);
/* inheritance requested? */
bool inh;
/* relation kind (see pg_class.relkind) */
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 9b9cf6f34381..fbb05eab1bbe 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -33,6 +33,7 @@
#include "postgres.h"
#include "access/transam.h"
+#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "common/hashfn.h"
#include "miscadmin.h"
@@ -67,6 +68,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry_relid(JumbleState *jstate,
+ RangeTblEntry *expr,
+ Oid relid);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -519,3 +523,26 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+/*
+ * Custom query jumble function for RangeTblEntry.relid.
+ */
+static void
+_jumbleRangeTblEntry_relid(JumbleState *jstate,
+ RangeTblEntry *expr,
+ Oid relid)
+{
+ /*
+ * If this is a temporary table, jumble its name instead of the table OID.
+ */
+ if (expr->rtekind == RTE_RELATION &&
+ isAnyTempNamespace(get_rel_namespace(expr->relid)))
+ {
+ char *relname = get_rel_name(expr->relid);
+
+ AppendJumble(jstate, (const unsigned char *) "pg_temp", sizeof("pg_temp"));
+ AppendJumble(jstate, (const unsigned char *) relname, strlen(relname));
+ }
+ else
+ JUMBLE_FIELD(relid);
+}
diff --git a/contrib/pg_stat_statements/expected/utility.out b/contrib/pg_stat_statements/expected/utility.out
index aa4f0f7e6280..941ba0e66deb 100644
--- a/contrib/pg_stat_statements/expected/utility.out
+++ b/contrib/pg_stat_statements/expected/utility.out
@@ -9,6 +9,39 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
(1 row)
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | query
+-------+----------------------------------------------------
+ 2 | BEGIN
+ 2 | COMMIT
+ 2 | CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP
+ 2 | SELECT * FROM temp_t
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(5 rows)
+
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
-- Tables, indexes, triggers
CREATE TEMP TABLE tab_stats (a int, b char(20));
CREATE INDEX index_stats ON tab_stats(b, (b || 'data1'), (b || 'data2')) WHERE a > 0;
diff --git a/contrib/pg_stat_statements/sql/utility.sql b/contrib/pg_stat_statements/sql/utility.sql
index dd97203c2102..e21b656d44a8 100644
--- a/contrib/pg_stat_statements/sql/utility.sql
+++ b/contrib/pg_stat_statements/sql/utility.sql
@@ -6,6 +6,18 @@
SET pg_stat_statements.track_utility = TRUE;
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
-- Tables, indexes, triggers
CREATE TEMP TABLE tab_stats (a int, b char(20));
CREATE INDEX index_stats ON tab_stats(b, (b || 'data1'), (b || 'data2')) WHERE a > 0;
--
2.49.0
[application/pgp-signature] signature.asc (833B, 4-signature.asc)
download
view thread (10+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: query_id: jumble names of temp tables for better pg_stat_statement UX
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox