Re: [PATCH] Optionally record Plan IDs to track plan changes for a query

From: Andrei Lepikhov <[email protected]>
To: Lukas Fittl <[email protected]>
To: PostgreSQL Hackers <[email protected]>
Cc: Marko M <[email protected]>
Cc: Sami Imseih <[email protected]>
Subject: Re: [PATCH] Optionally record Plan IDs to track plan changes for a query
Date: Fri, 24 Jan 2025 16:23:08 +0700
Message-ID: <[email protected]>
In-Reply-To: <CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg@mail.gmail.com>
References: <CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg@mail.gmail.com>

On 1/3/25 03:46, Lukas Fittl wrote:
> My overall perspective is that (1) is best done in-core to keep overhead 
> low, whilst (2) could be done outside of core (or merged with a future 
> pg_stat_statements) and is included here mainly for illustration purposes.
Thank you for the patch and your attention to this issue!

I am pleased with the export of the jumbling functions and their 
generalisation.

I may not be close to the monitoring area, but I use queryId and other
tools to distinguish plan nodes inside extensions. Just as queryId
serves as a class identifier for queries, plan_id identifies a class of
nodes, not a single node. In the implementation provided here, nodes
with the same hash can represent different subtrees. For example,
JOIN(A, JOIN(B,C)) and JOIN(JOIN(B,C),A) may have the same ID.

Moreover, I wonder whether this version of plan_id reacts to a change in
join level. It appears that only a change in the join clause alters the
plan_id hash value, which means you would end up with a single hash for
very different plan nodes. Is that acceptable? To address this, we
should fold in the hashes of the left and right subtrees and the hashes
of each subplan (especially in the case of Append).
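
As a rough illustration of the point above, here is a toy Python model. It
is not PostgreSQL code and does not reproduce the patch's jumbling logic:
PlanNode, shallow_id and structural_id are hypothetical names. shallow_id
hashes only the node's own type and clause, as the behaviour is described
above; structural_id also folds in the child hashes, in order.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy stand-in for a plan node; not PostgreSQL's Plan struct."""
    kind: str                      # e.g. "Join", "Scan", "Append"
    detail: str = ""               # e.g. a join clause or a relation name
    children: List["PlanNode"] = field(default_factory=list)


def _digest(*parts: str) -> str:
    """Hash a sequence of strings; length prefixes keep part boundaries distinct."""
    h = hashlib.sha256()
    for p in parts:
        h.update(f"{len(p)}:".encode())
        h.update(p.encode())
    return h.hexdigest()[:16]


def shallow_id(node: PlanNode) -> str:
    """Hash only the node's own fields; the subtrees are ignored."""
    return _digest(node.kind, node.detail)


def structural_id(node: PlanNode) -> str:
    """Hash the node's own fields plus each child's hash, in child order."""
    child_ids = [structural_id(c) for c in node.children]
    return _digest(node.kind, node.detail, *child_ids)


# JOIN(A, JOIN(B,C)) versus JOIN(JOIN(B,C), A), both with the same top clause.
a, b, c = (PlanNode("Scan", name) for name in ("a", "b", "c"))
inner = PlanNode("Join", "b.x = c.x", [b, c])
join_a_bc = PlanNode("Join", "a.x = b.x", [a, inner])
join_bc_a = PlanNode("Join", "a.x = b.x", [inner, a])
```

In this model the two join trees share a shallow_id but get different
structural_ids, and the same holds for two Append nodes that differ only in
their subplans.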

Overall, as in the discussions on queryId, various extensions may want
different logic for generating plan_id (stronger or weaker uniqueness
guarantees, for example). Hence, it would be beneficial to separate this
logic and allow extensions to provide their own plan_ids. IMO, what we
need is a 'List *ext' field in each of the Plan, Path, PlannedStmt, and
Query structures. Such an 'ext' field could hold whatever extensions
want to attach, without interference between them; a specific plan_id is
one example.
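
To make the 'ext' idea concrete, a minimal Python sketch follows. It only
models the shape of the proposal: the Plan class, ext_set, ext_get and the
extension names are all hypothetical, and a real implementation would be a
C List of nodes rather than a Python list of tuples.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class Plan:
    """Toy plan node carrying a per-extension 'ext' list (hypothetical)."""
    node_type: str
    ext: List[Tuple[str, Any]] = field(default_factory=list)


def ext_set(plan: Plan, owner: str, payload: Any) -> None:
    """Store payload under the owning extension's name, replacing its old entry."""
    for i, (name, _) in enumerate(plan.ext):
        if name == owner:
            plan.ext[i] = (owner, payload)
            return
    plan.ext.append((owner, payload))


def ext_get(plan: Plan, owner: str) -> Optional[Any]:
    """Return the payload stored by the given extension, or None if absent."""
    for name, payload in plan.ext:
        if name == owner:
            return payload
    return None


# Two extensions attach their own notion of plan_id to the same node.
node = Plan("HashJoin")
ext_set(node, "ext_coarse", {"plan_id": 17})
ext_set(node, "ext_strict", {"plan_id": 90210})
```

Each extension reads back only its own entry, so one can use a coarse
plan_id and another a strict one on the same node.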

Additionally, we could bridge the gap between the cloud of paths and the 
plan by adding a hook at the end of the create_plan_recurse routine. 
This may facilitate the transfer of information regarding optimiser 
decisions that could be influenced by an extension into the plan.
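
The hook idea can be sketched in the same toy style. Everything here is an
assumption made for illustration: ToyPath, ToyPlan, plan_created_hooks and
this create_plan_recurse are Python stand-ins, not the planner's real
structures or signatures. The only point is where the hook fires: at the
end of the routine, once the plan node and its children exist.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToyPath:
    """Toy stand-in for a planner path, with extension data attached."""
    kind: str
    children: List["ToyPath"] = field(default_factory=list)
    ext: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyPlan:
    """Toy stand-in for a plan node."""
    kind: str
    children: List["ToyPlan"] = field(default_factory=list)
    ext: Dict[str, Any] = field(default_factory=dict)


# Hypothetical hook registry: each callback receives (path, plan).
plan_created_hooks: List[Callable[[ToyPath, ToyPlan], None]] = []


def create_plan_recurse(path: ToyPath) -> ToyPlan:
    """Build a plan node from a path, then let hooks copy data across."""
    plan = ToyPlan(path.kind, [create_plan_recurse(c) for c in path.children])
    for hook in plan_created_hooks:   # fires last, once the node is complete
        hook(path, plan)
    return plan


def carry_ext(path: ToyPath, plan: ToyPlan) -> None:
    """Example hook: copy whatever the extension left on the path into the plan."""
    plan.ext.update(path.ext)
```

With carry_ext registered, anything an extension recorded on a path during
planning is still available on the corresponding plan node afterwards.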

-- 
regards, Andrei Lepikhov
