public inbox for [email protected]  
help / color / mirror / Atom feed
From: Sami Imseih <[email protected]>
To: cca5507 <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Support loser tree for k-way merge
Date: Wed, 3 Dec 2025 12:14:25 -0600
Message-ID: <CAA5RZ0u=WDa-v+8Y_yE4WeRS8VtmcmQGuWVjZFCqZcXx-hEP3w@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

Hi,

Thanks for raising this.

> Now we use 'heap' during the k-way merge, it's O(n log k). The 'loser tree' is also O(n log k), but
> it's usually has fewer comparisons than the 'heap'. When the tuple comparator is complex, the
> 'loser tree' can significantly speed up the k-way merge.

I was playing around with this and I do also see the performance
benefits that you
mention with loser tree.

```
DROP TABLE IF EXISTS t;
CREATE TABLE t (
    col1 INT,
    col2 INT
);


INSERT INTO t (col1, col2)
SELECT
    (i % 10000000),
    (i % 100)
FROM generate_series(1, 10000000) AS s(i);
```

Using something like the above, testing an "external merge" on the
high cardinality
column there is a bit of an improvement with loser tree, but there is
a bit of a loss on
low-cardinality.

In general though, I see a bit of an issue with allowing a GUC to
change strategies. What happens if there are multiple "external merge"
nodes and they can each benefit from different strategies? Maybe that's
not an issue since all the other enable_ GUCs have that problem, but
it is something to consider.

Can we drive the decision for what to do based on optimizer
stats, i.e. n_distinct and row counts? Not sure what the calculation would
be specifically, but something else to consider.

We can still provide the GUC to  override the optimizer decisions,
but at least the optimizer, given up-to-date stats, may get it right most
of the time.

--
Sami Imseih
Amazon Web Services (AWS)





reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Support loser tree for k-way merge
  In-Reply-To: <CAA5RZ0u=WDa-v+8Y_yE4WeRS8VtmcmQGuWVjZFCqZcXx-hEP3w@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox