public inbox for [email protected]  
help / color / mirror / Atom feed
From: Lukas Fittl <[email protected]>
To: Michael Paquier <[email protected]>
Cc: Sami Imseih <[email protected]>
Cc: zengman <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: Julien Rouhaud <[email protected]>
Subject: Re: Refactor query normalization into core query jumbling
Date: Sun, 29 Mar 2026 22:21:46 -0700
Message-ID: <CAP53Pkwb2f288Ji2s9ZZTL0_PEyYCZpQjBtD63X_s+yUB3YORA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAA5RZ0ui6JSDPQDs2RW7eBDvrH_t_xg8bhUM8FaxawU8wDoMFg@mail.gmail.com>
	<[email protected]>

On Fri, Mar 27, 2026 at 6:42 PM Michael Paquier <[email protected]> wrote:
>
> > On Mar 28, 2026, at 2:09, Sami Imseih <[email protected]> wrote:
> > I agree that ComputeConstantLengths should be in core. This one is
> > not a gray area IMO. The query jumble already records constant locations,
> > but leaves the lengths unset. ComputeConstantLengths is just the
> > completion of that  work. There could be no other interpretation,
> > unlike generate_normalized_query, of what the lengths should be.
>
> This is an interesting remark. One problem with any patches presented yet is that we don’t attach the lengths as part of the in-core query jumbling procedure: we plug the lengths later using the same structure as the query jumbling. It seems to me that this is half-baked overall. I think that we don’t want to pay the extra length computation in the core query jumbling at the end, then why does it make sense to include the lengths in the JumbleState structure at all? Shouldn’t we use a different structure filled inside PGSS for this purpose rather than reuse the same thing for PGSS and the in-core part (don’t have the code in front of me now).

I see where you're coming from on that, but I don't think we can
remove anything here in practice:

typedef struct LocationLen
{
    int            location; /* Required */
    int            length; /* Set by _jumbleElements */
    bool        squashed; /* Set by RecordConstLocation called from
_jumbleElements */
    bool        extern_param; /* Set by RecordConstLocation called
from  _jumbleParam */
} LocationLen;

typedef struct JumbleState
{
    unsigned char *jumble; /* Required */
    Size        jumble_len; /* Required */
    LocationLen *clocations; /* Required to track constant locations */
    int            clocations_buf_size; /* Required to track constant
locations */
    int            clocations_count; /* Required to track constant locations */
    int            highest_extern_param_id; /* Set by _jumbleParam */
    bool        has_squashed_lists; /* Set by _jumbleElements */
    unsigned int pending_nulls; /* Required */
    Size        total_jumble_len; /* Required */
} JumbleState;

The only refactoring I could think of is to write out the
_jumbleElements information separately, then we could actually drop
length, and maybe squashed, from LocationLen. But I'm not sure that
really buys us much? It would be more clear I guess, because the mixed
locations of where length gets set is indeed a bit odd.

I still think it'd be reasonable for us to include
ComputeConstantLengths in core to complete the picture of what we're
doing with _jumbleElements and the length field already anyway. Its
basically a way to fully hydrate the partially filled out JumbleState
from the initial jumble.

Thanks,
Lukas

-- 
Lukas Fittl





view thread (35+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Refactor query normalization into core query jumbling
  In-Reply-To: <CAP53Pkwb2f288Ji2s9ZZTL0_PEyYCZpQjBtD63X_s+yUB3YORA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox