public inbox for [email protected]  
help / color / mirror / Atom feed
From: Peter J. Holzer <[email protected]>
To: [email protected]
Subject: Re: Wrapping a select in another select makes it slower
Date: Wed, 23 Jul 2025 10:24:44 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAApHDvrDM=QYt2m5oWpRME3Hdy6S=sQVBpxOTqOWvDeUOnR1EQ@mail.gmail.com>
References: <[email protected]>
	<CAApHDvrDM=QYt2m5oWpRME3Hdy6S=sQVBpxOTqOWvDeUOnR1EQ@mail.gmail.com>

On 2025-07-23 10:08:31 +1200, David Rowley wrote:
> On Wed, 23 Jul 2025 at 03:18, Peter J. Holzer <[email protected]> wrote:
> >
> > PostgreSQL version 17.5 on Ubuntu 24.04 (PGDG build).
> 
> 
> >               ->  Merge Left Join  (cost=4613.25..7180.30 rows=8357 width=136) (actual time=222.037..292242.701 rows=40460 loops=1)
> >                     Merge Cond: (ns.nspname = s.schemaname)
> >                     Join Filter: ((s.tablename = tbl.relname) AND (s.attname = att.attname))
> >                     Rows Removed by Join Filter: 1215045276
> 
> >                     ->  Sort  (cost=4476.84..4477.18 rows=139 width=200) (actual time=156.864..178628.030 rows=1215036357 loops=1)
> >                           Sort Key: s.schemaname
> >                           Sort Method: external sort  Disk: 6480kB
> >                           ->  Subquery Scan on s  (cost=2189.59..4471.89 rows=139 width=200) (actual time=81.840..131.044 rows=30921 loops=1)
> 
> > The plans are clearly different, with the slow one claiming to sort 1.2
> > billion rows (but using only 6.5MB of disk space) despite the node below
> > it only returning 30921 rows. 1215036357 is almost but not quite
> > 30921*40460, but that nested loop is a sibling of the sort, so it
> > shouldn't be included, right?
> 
> This is coming from the "mark and restore" due to the Merge Join that
> executes that sort on its inner side "rewinding" the inner side back
> to check the matches for the next outer row. This is happening because
> the Merge Join on the slow query only contains 1 of the 3 columns
> you're joining on. Notice the Sort's subnode only returned 30921 rows.
> 
> > And why are the plans different at all? Computing a few extra values per
> > row shouldn't change the cost of the query delivering the rows, IMHO.
> > But then the costs are very similar, so maybe it's just some random
> > variation.
> 
> That's a good question. What happens if you SET
> enable_incremental_sort=0; and run the slow query again?

This fixes the problem.


> I've not managed to recreate this locally, so far.

I was afraid of that. I mentioned that it happened on one database but
didn't emphasize that it's really only one of several PostgreSQL 17.x
databases we have. There is even another database in the same cluster
where it doesn't happen- So it probably takes a particularly unlucky
combination of statistics to trigger this problem.

(FWIW, the database contains a limesurvey instance: About 500 smallish
tables. So nothing crazy)

> I tried on 17.5 with hash joins disabled to get the merge join plan
> but my merge join includes all 3 clauses rather than your 1 clause.
> I'm unsure why your planner doesn't include all 3.

Yes, that's strange.


> To eliminate 1 moving part, is it still slow if you make the outer
> query just SELECT * FROM the subquery rather than have the additional
> calculations in the select list?

Yes, that's still slow. So the additional columns were a red herring.
It's really just the additional SELECT.

        hjp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | [email protected]         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

view thread (3+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Wrapping a select in another select makes it slower
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox