public inbox for [email protected]  
help / color / mirror / Atom feed
Why not do distinct before SetOp
2+ messages / 2 participants
[nested] [flat]

* Why not do distinct before SetOp
@ 2024-11-04 09:52 ma lz <[email protected]>
  2024-11-04 11:09 ` Re: Why not do distinct before SetOp David Rowley <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: ma lz @ 2024-11-04 09:52 UTC (permalink / raw)
  To: pgsql-general

some sql like ' select a from t1 intersect select a from t1 '

if t1 has large number rows but has few distinct rows

select distinct a from t1 intersect select distinct a from t1;    — this is faster than origin sql

can postgres do this optimize during plan-queries?


^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Why not do distinct before SetOp
  2024-11-04 09:52 Why not do distinct before SetOp ma lz <[email protected]>
@ 2024-11-04 11:09 ` David Rowley <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: David Rowley @ 2024-11-04 11:09 UTC (permalink / raw)
  To: ma lz <[email protected]>; +Cc: pgsql-general

On Mon, 4 Nov 2024 at 22:52, ma lz <[email protected]> wrote:
>
> some sql like ' select a from t1 intersect select a from t1 '
>
> if t1 has large number rows but has few distinct rows
>
> select distinct a from t1 intersect select distinct a from t1;    — this is faster than origin sql
>
> can postgres do this optimize during plan-queries?

No, the planner does not attempt that optimisation. INTERSECT really
isn't very well optimised.

If we did want to improve this area, I think the first thing we'd want
to do is use standard join types rather than HashSetOp Intersect to
implement INTERSECT (without ALL).  To do that efficiently, we'd need
to do a bit more work on the standard join types to have them
efficiently support IS NOT DISTINCT FROM clauses as the join keys.
There's a fair bit of work to do and it's likely not been done as
INTERSECT isn't used that commonly.

There was a bit of work done in PG17 to teach the query planner some
new tricks around UNION. I think UNION is a much more commonly used
setop than INTERSECT, so you might have to wait a while. For now, it's
best to adjust your query.

David






^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2024-11-04 11:09 UTC | newest]

Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-11-04 09:52 Why not do distinct before SetOp ma lz <[email protected]>
2024-11-04 11:09 ` David Rowley <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox