Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5ZcR-003O31-1W for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Mar 2026 01:34:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w5ZcP-000URD-20 for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Mar 2026 01:34:42 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5ZcP-000UR4-0h for pgsql-hackers@lists.postgresql.org; Thu, 26 Mar 2026 01:34:41 +0000 Received: from mail-qv1-xf2a.google.com ([2607:f8b0:4864:20::f2a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w5ZcN-0000000142x-3lrT for pgsql-hackers@postgresql.org; Thu, 26 Mar 2026 01:34:40 +0000 Received: by mail-qv1-xf2a.google.com with SMTP id 6a1803df08f44-899eabc5292so4662536d6.0 for ; Wed, 25 Mar 2026 18:34:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774488879; cv=none; d=google.com; s=arc-20240605; b=Z/jgchTLxR4pZ6lj8IAddveVMTDKir3T3zXWrSr6CNLjJS66SdmpKxLmDW/QChr9+T Qpgv9VUlXQY30ousESC2+olcdrBXSdXADmIQihaY3I5fVLZ4BqoAKyLUxnCIFEiv18Vq Ot0AWolEJEj10M1p7oXYMaZaKQnuEm5VmDgaQOH6aNnmHGk5zdFoZuTtazROX80eiwW2 BdeVzVcVBjrpepgWQ89R8ImKzPlJ3b3MJGDHNcb1/OcD6JIcRX7sqvy/4v51dfMjvCfX ZLd3jLYftcl5xR8e0vrtCjJTx6fsa3pKcxyRv6bjmf2hadoX2jq4q8y4Nlnw7skfFFfg 7G2w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=hASOr1XJNojxYYVB/9maHwKgvjMfaMuyPoMjxPeJpRU=; fh=xQCGJBatSBAa09LE+CpVoNto9zQwWK/QV5y8E+rnqy8=; b=RHN7DaCFHaTgtc42AvAzPV3SYXd+F3ZRSC5ZRe8+Fsmkw30a9JBL1FrVJy5ApPZAo0 Q3xjYcprQRMFYlyUINRZBU9lWcGt7Q0RmyqOtBUIEPSn19xEqW645PpjhpilnbXdLMQX tDYrF0WIM8XKizmoPSLauLJjB2x7ibMM1dAY99Bi8hPxKNteO6fZBBNTvVle/O9QfJnP x/HVrMpqsDwrgl0nHwRYYwY4qoLsmqqKX0eBzXXTn5jp4cUTeA8nmpmp8kkoX7mqFF8h x3tQDXvXJyNsV49Q++uaj95E6ss6Je9Evlen6A4wedUao3I2PXo3Wwj/MvmXa1YQqCff f5uA==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fittl.com; s=google; t=1774488879; x=1775093679; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=hASOr1XJNojxYYVB/9maHwKgvjMfaMuyPoMjxPeJpRU=; b=NZuly7hGWxFJpemA6hn2ywf5Homt/V2rMzFkCdcDxYnJHrGOpXSvyxPnVE2DMz91tu ORQxC1hDwdHzSKBOMwO1NlIEUIemR1X9gnTg09KBHKDb0pBnprMtLHUaxVZy+obtQsFJ IiAYb85krLOHjWfOH0QErcjcl3MHW/MHiPYqI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774488879; x=1775093679; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=hASOr1XJNojxYYVB/9maHwKgvjMfaMuyPoMjxPeJpRU=; b=he9nUI6Fs1LqGtLgADqPNKUA/V+zCEM1rlAs5JbP5rtmJy9ByMHWPIiph7QSqwuHyj MAwlst8MPBYDyYJMg+nDJoHp3TcwVT/dGbLLzthTk4xxK8K8ee9ZWjSgKYo4JOFFmzm2 gg7DI18v0WieTy2CSRVRkhEnm9mbKrob32DMxyHHr4iGSCEaMSGWu3Clz84JL+hOyCAo J6awZ8X8S3kg8GbxNGEATDOswf9xjGkDnHLl/ULpJPcnnQ4cFSeKSrQVMZb9N28JqUqx 2Z3F+sC+kf5t2cr7ebXJU+ETLcudk29EDKNRNy2yBitwaczl9ZCujRhxJWROnNCjAv29 nFJQ== X-Forwarded-Encrypted: i=1; AJvYcCVSjw9uK8vOLg7WaoU8ii4yfOUvyYrMvw3jXuVddOIk5dFi50PDRtDzi248uYQ9eQveGh5pqEN/XJCTX2yf@postgresql.org X-Gm-Message-State: AOJu0Yw2tMoX80CPQTt9ZYeri9KapDCmxlxvpghXggvLHzv8RIEeGiXN RFbNARFv0ML01IdveKMbOTO6z9Bmd4V6h9KbjiX6Aymjzx18pKmgj2vUpAdZPfNqAWA4mMJofZ3 3ZVxk7nddrWzLkZCEA19wy7emP2PH/aQ/y93M0aZS X-Gm-Gg: ATEYQzwqgufpentx2iRfjUzkBmuBjjEJGb/8faIfdnOy3zh/lVuY3QWuZt4HKIJJWnV aG/nwxw3WwVOEr0sn73NYPjoFPQKQV1MUfpxr6HSAy7HwGUipxst8jCzvewRDkUgIfCq2Xiw2NC dh7+zgw7XWqeY+pImyltnTNW6AqAMLOWiJ8KeUDwILi7eXYGrKF9NwdVweOCsYYLCxwI4Nc4gf8 N9UGenB0rCgYRjQxO04Vd95RY7O2ED/Wc/mQWwe/PNvJyZ6xWu9zJAjlFyopl7S7GlQkW5zL5v7 L9Hv54v7/I9UxfHVZOV+2CN5NJyoTcu8cbL7dG6hUOcK9DMGSMKCHiyGNj9kfUZfEF1WEOgj X-Received: by 2002:a05:6214:c67:b0:89a:1088:2044 with SMTP id 6a1803df08f44-89cc4a9e033mr87174616d6.45.1774488878886; Wed, 25 Mar 2026 18:34:38 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Lukas Fittl Date: Wed, 25 Mar 2026 18:34:02 -0700 X-Gm-Features: AaiRm5051PTQGXyTcK_pqUtlf6E-lRm1Vcg5UTmRt-bxp0hjeUqbVzhj_0rlbUI Message-ID: Subject: Re: Refactor query normalization into core query jumbling To: Sami Imseih Cc: Michael Paquier , zengman , pgsql-hackers , Julien Rouhaud Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Sami, On Mon, Mar 16, 2026 at 7:12=E2=80=AFPM Sami Imseih w= rote: > > > Here is v4. > > > > 0001 - addresses the comments made by Bertrand. > > 0002 - makes JumbleState a constant when passed to post_parse_analyze > > and updates all downstream code that consume the JumbleState. This > > means we now need to copy the locations from JState into a local/tempor= ary > > array when generating the normalized string. In 0001: > diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/que= ryjumblefuncs.c > > index 87db8dc1a32..d4b26202c47 100644 > --- a/src/backend/nodes/queryjumblefuncs.c > +++ b/src/backend/nodes/queryjumblefuncs.c > ... > @@ -773,3 +775,249 @@ _jumbleRangeTblEntry_eref(JumbleState *jstate, > + > + /* we don't want to re-emit any escape string warnings */ > + yyextra.escape_string_warning =3D false; > + I don't think this is needed anymore, as of 45762084545ec14dbbe66ace1d69d7e89f8978ac. > +/* > + * Callback to generate a normalized version of the query string that wi= ll be used to > + * represent all similar queries. > + * I don't think the term "Callback" makes sense here - I think you could just keep the original wording. In 0002: > diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/que= ryjumblefuncs.c > index d4b26202c47..fe8f0ccd278 100644 > --- a/src/backend/nodes/queryjumblefuncs.c > +++ b/src/backend/nodes/queryjumblefuncs.c > ... > @@ -813,14 +815,20 @@ SetConstantLengths(JumbleState *jstate, const char = *query, > core_YYSTYPE yylval; > YYLTYPE yylloc; > > + if (jstate->clocations_count =3D=3D 0) > + return NULL; > + > + /* Copy constant locations to avoid modifying jstate */ > + locs =3D palloc(jstate->clocations_count * sizeof(LocationLen)); > + memcpy(locs, jstate->clocations, jstate->clocations_count * sizeof(L= ocationLen)); > + You could use palloc_array for locs here. > @@ -938,12 +948,13 @@ GenerateNormalizedQuery(JumbleState *jstate, const = char *query, > last_off =3D 0, /* Offset from start for previous tok = */ > last_tok_len =3D 0; /* Length (in bytes) of that tok *= / > int num_constants_replaced =3D 0; > + LocationLen *locs =3D NULL; > > /* > * Set constants' lengths in JumbleState, as only locations are set d= uring > * DoJumble(). Note this also ensures the items are sorted by locatio= n. > */ > - SetConstantLengths(jstate, query, query_loc); > + locs =3D SetConstantLengths(jstate, query, query_loc); I think we should update the comment here to reflect the fact that we're no longer modifying JumbleState. Otherwise these patches look good - it'd be nice to still get this into 19 so we have less code duplication across the different extensions working with normalized query text. Thanks, Lukas --=20 Lukas Fittl