public inbox for [email protected]  
help / color / mirror / Atom feed
From: Lukas Fittl <[email protected]>
To: Sami Imseih <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: zengman <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: Julien Rouhaud <[email protected]>
Subject: Re: Refactor query normalization into core query jumbling
Date: Wed, 25 Mar 2026 18:34:02 -0700
Message-ID: <CAP53PkzCzeDpg6dCzVrU17LCiCmRSWfEt-5dr4y+VHnTxKr_9w@mail.gmail.com> (raw)
In-Reply-To: <CAA5RZ0vEnNvp7_Ni8bWwh4GE53rg_YwNjqynQG=JpNu3YzzAWA@mail.gmail.com>
References: <CAA5RZ0tZp5qU0ikZEEqJnxvdSNGh1DWv80sb-k4QAUmiMoOp_Q@mail.gmail.com>
	<[email protected]>
	<CAA5RZ0vQfE14HyfpoPXDRThVcdCkLgY_HGz+J2qLB9soNUE9QQ@mail.gmail.com>
	<[email protected]>
	<CAA5RZ0uLS9RrpO2roX7p3EHE4-VJkBsGAB970jrbo1-GRDAi0g@mail.gmail.com>
	<[email protected]>
	<CAA5RZ0sbWmqdUBFo8JXMJe72pnwjxVY58htJ6pKbwnyQuRctQw@mail.gmail.com>
	<CAP53PkxqGbPw5VzpacyJb2wTofYJadCoUmxV8s2o5tHzKznwbg@mail.gmail.com>
	<CAA5RZ0t3-srmtaVNfzv0HEzXCa78eQDKYVak+3zgEccnKPZHHA@mail.gmail.com>
	<CAA5RZ0t_CJs9ye1PPBy-e0ajZ-6FgKvNEoacW9keCtF_7Y2ycQ@mail.gmail.com>
	<CAA5RZ0vEnNvp7_Ni8bWwh4GE53rg_YwNjqynQG=JpNu3YzzAWA@mail.gmail.com>

Hi Sami,

On Mon, Mar 16, 2026 at 7:12 PM Sami Imseih <[email protected]> wrote:
>
> > Here is v4.
> >
> > 0001 - addresses the comments made by Bertrand.
> > 0002 - makes JumbleState a constant when passed to post_parse_analyze
> > and updates all downstream code that consume the JumbleState. This
> > means we now need to copy the locations from JState into a local/temporary
> > array when generating the normalized string.

In 0001:

> diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
> > index 87db8dc1a32..d4b26202c47 100644
> --- a/src/backend/nodes/queryjumblefuncs.c
> +++ b/src/backend/nodes/queryjumblefuncs.c
> ...
> @@ -773,3 +775,249 @@ _jumbleRangeTblEntry_eref(JumbleState *jstate,
> +
> +    /* we don't want to re-emit any escape string warnings */
> +    yyextra.escape_string_warning = false;
> +

I don't think this is needed anymore, as of
45762084545ec14dbbe66ace1d69d7e89f8978ac.

> +/*
> + * Callback to generate a normalized version of the query string that will be used to
> + * represent all similar queries.
> + *

I don't think the term "Callback" makes sense here - I think you could
just keep the original wording.

In 0002:

> diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
> index d4b26202c47..fe8f0ccd278 100644
> --- a/src/backend/nodes/queryjumblefuncs.c
> +++ b/src/backend/nodes/queryjumblefuncs.c
> ...
> @@ -813,14 +815,20 @@ SetConstantLengths(JumbleState *jstate, const char *query,
>     core_YYSTYPE yylval;
>     YYLTYPE        yylloc;
>
> +    if (jstate->clocations_count == 0)
> +        return NULL;
> +
> +    /* Copy constant locations to avoid modifying jstate */
> +    locs = palloc(jstate->clocations_count * sizeof(LocationLen));
> +    memcpy(locs, jstate->clocations, jstate->clocations_count * sizeof(LocationLen));
> +

You could use palloc_array for locs here.

> @@ -938,12 +948,13 @@ GenerateNormalizedQuery(JumbleState *jstate, const char *query,
>                 last_off = 0,    /* Offset from start for previous tok */
>                 last_tok_len = 0;    /* Length (in bytes) of that tok */
>     int            num_constants_replaced = 0;
> +    LocationLen *locs = NULL;
>
>     /*
>      * Set constants' lengths in JumbleState, as only locations are set during
>      * DoJumble(). Note this also ensures the items are sorted by location.
>      */
> -    SetConstantLengths(jstate, query, query_loc);
> +    locs = SetConstantLengths(jstate, query, query_loc);

I think we should update the comment here to reflect the fact that
we're no longer modifying JumbleState.

Otherwise these patches look good - it'd be nice to still get this
into 19 so we have less code duplication across the different
extensions working with normalized query text.

Thanks,
Lukas

-- 
Lukas Fittl





view thread (35+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Refactor query normalization into core query jumbling
  In-Reply-To: <CAP53PkzCzeDpg6dCzVrU17LCiCmRSWfEt-5dr4y+VHnTxKr_9w@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox