public inbox for [email protected]  
help / color / mirror / Atom feed
From: David Rowley <[email protected]>
To: ma lz <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Why not do distinct before SetOp
Date: Tue, 5 Nov 2024 00:09:35 +1300
Message-ID: <CAApHDvqdDwEXxhZLTwsHkWnvpvVSYT2OXSzfxRrs2p5xudr9fw@mail.gmail.com> (raw)
In-Reply-To: <TYCPR01MB63514B9C70EF2A80F0D43394F2512@TYCPR01MB6351.jpnprd01.prod.outlook.com>
References: <TYCPR01MB63514B9C70EF2A80F0D43394F2512@TYCPR01MB6351.jpnprd01.prod.outlook.com>

On Mon, 4 Nov 2024 at 22:52, ma lz <[email protected]> wrote:
>
> some sql like ' select a from t1 intersect select a from t1 '
>
> if t1 has large number rows but has few distinct rows
>
> select distinct a from t1 intersect select distinct a from t1;    — this is faster than origin sql
>
> can postgres do this optimize during plan-queries?

No, the planner does not attempt that optimisation. INTERSECT really
isn't very well optimised.

If we did want to improve this area, I think the first thing we'd want
to do is use standard join types rather than HashSetOp Intersect to
implement INTERSECT (without ALL).  To do that efficiently, we'd need
to do a bit more work on the standard join types to have them
efficiently support IS NOT DISTINCT FROM clauses as the join keys.
There's a fair bit of work to do and it's likely not been done as
INTERSECT isn't used that commonly.

There was a bit of work done in PG17 to teach the query planner some
new tricks around UNION. I think UNION is a much more commonly used
setop than INTERSECT, so you might have to wait a while. For now, it's
best to adjust your query.

David






view thread (2+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Why not do distinct before SetOp
  In-Reply-To: <CAApHDvqdDwEXxhZLTwsHkWnvpvVSYT2OXSzfxRrs2p5xudr9fw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox