Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tmV2t-0078oR-LQ for pgsql-general@arkaria.postgresql.org; Mon, 24 Feb 2025 09:46:40 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tmV2s-000FCj-Pr for pgsql-general@arkaria.postgresql.org; Mon, 24 Feb 2025 09:46:38 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tmV2s-000FCS-E7 for pgsql-general@lists.postgresql.org; Mon, 24 Feb 2025 09:46:38 +0000 Received: from mail-oo1-xc2e.google.com ([2607:f8b0:4864:20::c2e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tmV2p-000Vb2-03 for pgsql-general@lists.postgresql.org; Mon, 24 Feb 2025 09:46:37 +0000 Received: by mail-oo1-xc2e.google.com with SMTP id 006d021491bc7-5f4d935084aso2524378eaf.2 for ; Mon, 24 Feb 2025 01:46:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1740390393; x=1740995193; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=CBOjUDUQVDe2E69I8ELS5bfl6azM9NZgse2Az7xqi0E=; b=GYXkED3TARJ1wuPvEqIGutETuvXGoFj6yxXzjUF19s2p/LGIuS8SOKfPyydBF+pFt0 B3FoIcuw2v5DK71Y1TU1OgBJgZZmnWwf/gjs0x5JxMaUEuhsnnY2ZzWeGpPPSjLUopnl 9NrEAUIRG/fAgQzMIXtGSW6pqz3DjgWBljVH3yFUBPnDCNUQkCwDKofKjZB9CLyJZIA2 QW8w0yj5ZD15chvPy6G+RLyjSL+HTbIjlWT2Vc+xO1Cb7VdmwWitErwpUt/b2Im/rCv2 35WyjMTW03y/nxxjo2aOILnxLv0jmViLC1vKy35w/3/58B2RR1Qgofnis60zBBVu+bzx Rj+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1740390393; x=1740995193; h=cc:to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=CBOjUDUQVDe2E69I8ELS5bfl6azM9NZgse2Az7xqi0E=; b=XGu3P0KS8fIBFkplLQCl07t4c0Q0c9KMXUfQ8ahsE793mp0RcPlzvKHCtJ9X+a5V9y LJkglZsXTEJ5k6r2a5VeWiQJNIa4RLQhUzbf3P+xujWEKLwBzzPkCbCtSZQmU6x/yRw3 dXTCHUOxfJiORWCGGvy2sur+K+jBDsp6uyoG0PTd9PGj32TK5vclvKaqS8fOxz2kM1fN /nXDFxU8t6T47+utS+zZtp02P7UFJKJDU+VazbbhRkPDqzM91wOPHKAyJSxAKvo8bPk1 loG5QNBGUoxuhtOyEJta5BMOypFliA768E2Hg9JRbXYFE9wOSSDPAZ2C1H9x/l/BpOmr o2dg== X-Gm-Message-State: AOJu0YxKBSUshDRP+jMHCbF/q5EfEWZvFDT8V37EzaHN9fzsgzSvkKgB GRM6hdmluENmgjRM2UL5qTjOfO++GHlUMdFNdSedkpG1F7IRFV2HnRiLwfD5QJLUw2OCOqYVwP5 d25SNIzYm9L6KCgWXHMxTjDON5VEdwXY+ X-Gm-Gg: ASbGncsJxdCJ8ycUYaBaadq/dCMQKs7wLdXK/OqUbvyK4UpQVT/5046OL1d5Oz89Sgi 93HQmNj4gBrKm3h8745kyKrXkc3cSrsDvj8HW14uRygoGVojClfYSNSuclLPFL58HLqqVxZuo8B Xq2Y/9YPWHCg== X-Google-Smtp-Source: AGHT+IGaFRZPcD7WGej1E1tj9qdEXDFFSI03q7sRje8w84jC2f00RmsaXHU3iKVT2rTG3ijqJSl0qgVm+uV3bfuW1g0= X-Received: by 2002:a05:6808:191d:b0:3f4:899:a817 with SMTP id 5614622812f47-3f4247ce50dmr8100788b6e.31.1740390393533; Mon, 24 Feb 2025 01:46:33 -0800 (PST) MIME-Version: 1.0 From: Dominique Devienne Date: Mon, 24 Feb 2025 10:46:17 +0100 X-Gm-Features: AWEUYZnkQ7pGcmFKJ-FkHlYSQHQNU0cjjJbYNW4NuXjhMo-MHPre_qzhiAS9oJw Message-ID: Subject: Keep specialized query pairs, or use single more general but more complex one To: pgsql-general@lists.postgresql.org Cc: andrew@tao11.riddles.org.uk Content-Type: multipart/alternative; boundary="0000000000002d5d8b062ee037eb" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000002d5d8b062ee037eb Content-Type: text/plain; charset="UTF-8" Hi, We have a few dozen queries involving grandparent, parent, child table triplets, to implement a pre-existing API on top of PostgreSQL. That API is not SQL-based nor SQL "friendly" either, that's why we detect patterns in the APIs inputs, to use different SQL queries, as (important) optimizations, thus the "dozen of queries" above. But now we have a new requirement, for "fuzzy find". I.e. the client can ask for names which are not the exact in-DB names, but also aliases of those names. That requires a different query, which is a bit more complex. Here's an example: # existing "exact-name" query select ... from child c join parent s on s.id = c.parent join grantparent w on w.id = s.parent where w.name = $1 and s.name = $2 and c.name = $3 # new "aliased name" query select ... from child c join parent s on s.id = c.parent join grantparent w on w.id = s.parent join unnest($3::text[]) with ordinality as aliases(name, ord) on c.name = aliases.name where w.name = $1 and s.name = $2 order by aliases.ord limit 1 Notice the limit 1, i.e. this is still a scalar query, since name or (parent, name) are UNIQUE, and the fact the alias query is ordered, the first match in alias order is "preferred". Given the above, it is obvious that if we stick the exact name in front of its aliases, and use only the 2nd query, this is functionally equivalent (if it isn't, please let us know!). And from a maintenance perspective, not doubling our queries sounds like a good thing. But then, I do worry about performance. Will the second more complex query be planned just as well of the 1st one? We have two types of clients (apps) for that API: * The first type never relies on fuzzy-find. * While the second type relies extensively on it. Thus I don't want to penalize the first type, over the second. I'd appreciate an expert opinion on planning (Tom? Andrew?), in the context of unnest+order by ordinal+limit 1 with UNIQUE constraints, versus the simpler 3-way-join with equality constraints, to base my decision on. Thanks, --DD PS: We are v16+ based. --0000000000002d5d8b062ee037eb Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi,

We have a few dozen queries involvi= ng grandparent, parent, child table triplets,
to implement a pre-= existing API on top of PostgreSQL. That API is not SQL-based
nor = SQL "friendly" either, that's why we detect patterns in the A= PIs inputs, to use
different SQL queries, as (important) optimiza= tions, thus the "dozen of queries" above.

But now we have a new requirement, for "fuzzy find". I.e. the c= lient can ask for names
which are not the exact in-DB names, but = also aliases of those names. That requires a
different query, whi= ch is a bit more complex. Here's an example:

#= existing "exact-name" query
select ...
from child c=
join parent s on s.id =3D c.parent
join = grantparent w on w.id =3D s.parent
where w.name =3D $1
and s= .name =3D $2
and c.name =3D $3

=
# new "aliased name" query
select ...
fro= m child c =C2=A0 =C2=A0
join parent s on s.id =3D c.parent =C2=A0 =C2=A0
join grantparent w on
w.id =3D s.parent =C2=A0 =C2=A0
join unnest($3::text[]) with ordi= nality as aliases(name, ord) on c.name =3D aliases.name =C2=A0 =C2=A0
where w.name =3D $1 =C2=A0 =C2=A0
and s.name =3D $2 =C2=A0 =C2=A0
order by aliases.ord =C2=A0 = =C2=A0
limit 1

Notice the limit 1, i.e. this is= still a scalar query, since name or (parent, name) are UNIQUE,
a= nd the fact the alias query is ordered, the first match in alias order is &= quot;preferred".

Given the above, it is obvio= us that if we stick the exact name in front of its aliases, and use only th= e 2nd query, this is functionally equivalent (if it isn't, please let u= s know!).

And from a maintenance perspective, not = doubling our queries sounds like a good thing.

But= then, I do worry about performance. Will the second more complex query be = planned just as well of the 1st one?

We have two t= ypes of clients (apps) for that API:
* The first type never relie= s on fuzzy-find.
* While the second type relies extensively on it= .
Thus I don't want to penalize the first type, over the seco= nd.

I'd appreciate an expert opinion on planni= ng (Tom? Andrew?),
in the context of unnest+order by ordinal+limi= t 1 with UNIQUE constraints,
versus the simpler 3-way-join with e= quality constraints, to base my decision on.

Thank= s, --DD

PS: We are v16+ based.
--0000000000002d5d8b062ee037eb--