public inbox for [email protected]
help / color / mirror / Atom feedFrom: Karsten P <[email protected]>
To: [email protected]
Subject: Feature-Request: Performance-Optimization when using limit in combination with order by on querys using union
Date: Thu, 8 May 2025 11:57:09 +0200
Message-ID: <DB7PR08MB3081636A27C077E3BB4DA2DCDA8BA@DB7PR08MB3081.eurprd08.prod.outlook.com> (raw)
Hi,
i've already googled so far but didn't find anything regarding my problem..
I hope i'm here at the right place.
Following situation (this is just an simplyfied example):
suppose we have two tables, lets say
orders
- column 'order_number' -> varchar
- column 'order_date' -> timestamp
with index on order_date
and
invoices
- column 'invoice_number' -> varchar
- column 'invoice_date' -> timestamp
with index on invoice_date
and many records in both if them.
now we have a view combining both of them as
create view documents as
(
select order_number as document_number, order_date as document_date
from orders
union all select invoice_number, invoice_date from invoices
)
finding the last order placed in the database ist really easy:
select order_number from orders order by order_date desc limit 1
will result in an index scan backward on orders
same with invoices only...
but when querying the view
select document_number from documents order by document_date desc limit 1
seems to break down to
- collect all rows from orders
- combine it with all rows from invoices
- sort all rows (descending)
- limit to one row
with many data this is quite slow.
I've tested this with PG9.6 and PG14, it doesn't seem to make a
difference (correct me if i'm wrong).
So my question is: What about optimizing the query-planner that if
- a query with unions of selects is executed
- and an 'order by' in combination with 'limit' is applied on the
complete query (not only on subselects)
- and there is a matching index for each select
the order by and limit - part of the sql is also beeing applied on each
sub-select ?
actually
select document_number from documents order by document_date desc
limit 1
is beeing processed as
select order_number from orders
union all select invoice_number from invoices
order by document_number desc
limit 1
but would it be possible to let the query-optimizer expand the query to
select order_number from (
(select order_number, order_date from orders order by
order_date desc limit 1)
union all (select invoice_number, invoice_date from invoices
order by invoice_date desc limit 1)
) as subselect
order by order_date desc
limit 1
as this would use two (or number of unions) index-backward-scans
and than only has to reorder at maximum two rows before limiting to the
first of it?
this should be significantly faster.
thanks a lot and greetz,
Karsten
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: Feature-Request: Performance-Optimization when using limit in combination with order by on querys using union
In-Reply-To: <DB7PR08MB3081636A27C077E3BB4DA2DCDA8BA@DB7PR08MB3081.eurprd08.prod.outlook.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox