Re: Odd sudden performance degradation related to temp object churn

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Mark Kirkwood <[email protected]>
To: Jeremy Finzel <[email protected]>
To: Scott Marlowe <[email protected]>
Cc: Jerry Sievers <[email protected]>
Cc: Peter Geoghegan <[email protected]>
Cc: postgres performance list <[email protected]>
Subject: Re: Odd sudden performance degradation related to temp object churn
Date: Sat, 19 Aug 2017 13:49:49 +1200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAMa1XUg1MErrJifeeTNse+TQdbEZjs3t+tzOvnu7KCW9tUqcsA@mail.gmail.com>
References: <CAMa1XUi6pbugdB7zUA6weoPjmUHTzNLZcSEAh06-7FZRzkuHQw@mail.gmail.com>
	<CAH2-WznKCt_UBaPEY=WwBof+TkTaDYHbZmDsFE=Vigy3Rqi35A@mail.gmail.com>
	<[email protected]>
	<CAOR=d=3+N1467oddeN3E7r2PKELQt+85s+c+JjP-tgRHoYukpA@mail.gmail.com>
	<CAMa1XUg1MErrJifeeTNse+TQdbEZjs3t+tzOvnu7KCW9tUqcsA@mail.gmail.com>
List-Unsubscribe:  <mailto:[email protected]?body=unsub%20pgsql-performance>



On 19/08/17 02:21, Jeremy Finzel wrote:
> On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe 
> <[email protected] <mailto:[email protected]>> wrote:
>
>     So do iostat or iotop show you if / where your disks are working
>     hardest? Or is this CPU overhead that's killing performance?
>
>
> Sorry for the delayed reply. I took a look in more detail at the query 
> plans from our problem query during this incident. There are actually 
> 6 plans, because there were 6 unique queries.  I traced one query 
> through our logs, and found something really interesting. That is that 
> all of the first 5 queries are creating temp tables, and all of them 
> took upwards of 500ms each to run.  The final query, however, is a 
> simple select from the last temp table, and that query took 0.035ms!  
> This really confirms that somehow, the issue had to do with /writing 
> /to the SAN, I think.  Of course this doesn't answer a whole lot, 
> because we had no other apparent issues with write performance at all.
>
> I also provide some graphs below.
>
>
Hi, graphs for latency (or await etc) might be worth looking at too - 
sometimes the troughs between the IO spikes are actually when the disks 
have been overwhelmed with queued up pending IOs...

Also SANs are notorious for this sort of thing - typically they have a 
big RAM cache that you are actually writing to, and everything is nice 
and fast until your workload (along with everyone else's) fills up the 
cache and then performance drops of a cliff for a while (I've seen SAN 
disks with iostat utilizations of 105% <-- Lol... and await numbers that 
scroll off the page in that scenario)!

regards
Mark


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Odd sudden performance degradation related to temp object churn
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox