From: Tom Lane <[email protected]>
To: Mariel Cherkassky <[email protected]>
Cc: Gerardo Herzig <[email protected]>
Cc: [email protected]
Subject: Re: select with max functions
Date: Mon, 02 Oct 2017 11:29:59 -0400
Message-ID: <[email protected]>
In-Reply-To: <CA+t6e1nbJvJkw9vb5qwF2=8ORMnFk8PVb57jKoEwxF212ov4mw@mail.gmail.com>
References: <CA+t6e1mVtJveyoRRW8fLzY0tJhXntLrYxpSrk0=dDH8q93VPEA@mail.gmail.com>
	<[email protected]>
	<CA+t6e1=9if3_zd734XkBFY7xiS_OosE2t9sSK3JgPav_E9RgNA@mail.gmail.com>
	<[email protected]>
	<CA+t6e1nbJvJkw9vb5qwF2=8ORMnFk8PVb57jKoEwxF212ov4mw@mail.gmail.com>
List-Unsubscribe:  <mailto:[email protected]?body=unsub%20pgsql-performance>

Mariel Cherkassky <[email protected]> writes:
> explain analyze   SELECT Ma.User_Id,
>                       COUNT(*) COUNT
>                FROM   Manuim Ma
>                WHERE  Ma.Bb_Open_Date  =
>                                   (SELECT Bb_Open_Date
>                                    FROM   Manuim Man
>                                    WHERE  Man.User_Id = Ma.User_Id
>                                    ORDER  BY Bb_Open_Date DESC LIMIT 1
>                                   )
>                GROUP  BY Ma.User_Id
>                HAVING COUNT(*) > 1;

The core problem with this query is that the sub-select has to be done
over again for each row of the outer table, since it's a correlated
sub-select (i.e., it refers to Ma.User_Id from the outer table).  Replacing
a max() call with handmade logic doesn't do anything to help that.
I'd try refactoring it so that you calculate the max Bb_Open_Date just
once for each user id, perhaps along the lines of

SELECT Ma.User_Id,
       COUNT(*) COUNT
       FROM   Manuim Ma,
              (SELECT User_Id, max(Bb_Open_Date) as max
               FROM   Manuim Man
               GROUP BY User_Id) ss
       WHERE  Ma.User_Id = ss.User_Id AND
              Ma.Bb_Open_Date = ss.max
       GROUP  BY Ma.User_Id
       HAVING COUNT(*) > 1;

This is still not going to be instantaneous, but it might be better.
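
As a rough check that the two formulations select the same rows, here is a
minimal sketch using Python's sqlite3 module as a stand-in for PostgreSQL.
The sample rows, the NOT NULL constraints, and the aliases cnt and max_date
are assumptions made for illustration; only the table and column names come
from the queries above. It says nothing about relative speed in PostgreSQL.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Manuim (User_Id INTEGER NOT NULL, Bb_Open_Date TEXT NOT NULL)"
)
# Hypothetical sample rows; ISO date strings sort chronologically.
conn.executemany(
    "INSERT INTO Manuim VALUES (?, ?)",
    [
        (1, "2017-01-01"), (1, "2017-03-01"), (1, "2017-03-01"),
        (2, "2017-02-01"), (2, "2017-02-15"),
        (3, "2017-05-05"), (3, "2017-05-05"), (3, "2017-05-05"),
    ],
)

# Original shape: the sub-select is correlated with the outer row.
CORRELATED = """
SELECT Ma.User_Id, COUNT(*) AS cnt
FROM   Manuim Ma
WHERE  Ma.Bb_Open_Date = (SELECT Man.Bb_Open_Date
                          FROM   Manuim Man
                          WHERE  Man.User_Id = Ma.User_Id
                          ORDER  BY Man.Bb_Open_Date DESC
                          LIMIT  1)
GROUP  BY Ma.User_Id
HAVING COUNT(*) > 1
ORDER  BY Ma.User_Id
"""

# Refactored shape: the max date is computed once per user, then joined.
DERIVED = """
SELECT Ma.User_Id, COUNT(*) AS cnt
FROM   Manuim Ma,
       (SELECT User_Id, max(Bb_Open_Date) AS max_date
        FROM   Manuim
        GROUP  BY User_Id) ss
WHERE  Ma.User_Id = ss.User_Id AND
       Ma.Bb_Open_Date = ss.max_date
GROUP  BY Ma.User_Id
HAVING COUNT(*) > 1
ORDER  BY Ma.User_Id
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
derived_rows = conn.execute(DERIVED).fetchall()
print(correlated_rows)  # [(1, 2), (3, 3)]
print(derived_rows)     # [(1, 2), (3, 3)]
```

User 2 drops out of both results because its latest date occurs only once.
The columns are declared NOT NULL here because NULL dates are sorted and
aggregated differently by ORDER BY ... DESC and max(), so the equivalence is
only claimed for data without NULLs.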

It's possible that an index on (User_Id, Bb_Open_Date) would help,
but I'm not sure.
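
For reference, a two-column index of that shape could be declared as in the
sketch below, again using Python's sqlite3 module as a stand-in (an
assumption). The index name manuim_user_open_date_idx is hypothetical. The
same CREATE INDEX statement is valid in PostgreSQL, but whether the planner
would use it for this query is not something the sketch demonstrates.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Manuim (User_Id INTEGER NOT NULL, Bb_Open_Date TEXT NOT NULL)"
)
# Hypothetical index name; the column order (User_Id first) follows the text.
conn.execute(
    "CREATE INDEX manuim_user_open_date_idx ON Manuim (User_Id, Bb_Open_Date)"
)
# PRAGMA index_info returns (seqno, cid, name) for each indexed column.
index_columns = [
    row[2]
    for row in conn.execute("PRAGMA index_info(manuim_user_open_date_idx)")
]
print(index_columns)  # ['User_Id', 'Bb_Open_Date']
```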

			regards, tom lane

