Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1disvY-0000dV-2r for pgsql-performance@arkaria.postgresql.org; Sat, 19 Aug 2017 01:51:52 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1disvW-00071F-M9 for pgsql-performance@arkaria.postgresql.org; Sat, 19 Aug 2017 01:51:50 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1distj-0003o1-VB for pgsql-performance@postgresql.org; Sat, 19 Aug 2017 01:50:00 +0000 Received: from cat-porwal-prod-mail1.catalyst.net.nz ([2404:130:4080::4]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1distf-0005kL-2d for pgsql-performance@postgresql.org; Sat, 19 Aug 2017 01:49:57 +0000 Received: from localhost (localhost [127.0.0.1]) by cat-porwal-prod-mail1.catalyst.net.nz (Postfix) with ESMTP id 75D06808B6; Sat, 19 Aug 2017 13:49:50 +1200 (NZST) X-Virus-Scanned: Debian amavisd-new at cat-porwal-prod-mail1.servers.catalyst.net.nz Received: from cat-porwal-prod-mail1.catalyst.net.nz ([127.0.0.1]) by localhost (cat-porwal-prod-mail1.servers.catalyst.net.nz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id liEQx-TfHlOr; Sat, 19 Aug 2017 13:49:49 +1200 (NZST) Received: from [IPv6:2406:e001:a50:1:c41f:321b:f4e2:bcc8] (unknown [IPv6:2406:e001:a50:1:c41f:321b:f4e2:bcc8]) (Authenticated sender: mark.kirkwood@catalyst.net.nz) by cat-porwal-prod-mail1.catalyst.net.nz (Postfix) with ESMTPSA id 3278E808B4; Sat, 19 Aug 2017 13:49:49 +1200 (NZST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=catalyst.net.nz; s=catalyst; t=1503107389; bh=f3Mttb/Gljb6LX98bra0weHWEnYDH+Jfc7So5XRYjNk=; h=Subject:To:Cc:References:From:Date:In-Reply-To; b=DgyX4SoGxdD54qZZG7DV5Eho4gF0GGkR2oXn9p874eTS/B515JhSuWSTLx5n7ghKb Wck70ArLWgvdVmMvulvOlcxPmWlrMEptlpgbIcihUJB9Jjv0F1JqSaYO3Es1yddGU2 T3BWnqCdyj3AeJnEr3w3qR9qCeOS2h177UH1cVI0= Subject: Re: Odd sudden performance degradation related to temp object churn To: Jeremy Finzel , Scott Marlowe Cc: Jerry Sievers , Peter Geoghegan , postgres performance list References: <87mv71eoep.fsf@jsievers.enova.com> From: Mark Kirkwood Message-ID: Date: Sat, 19 Aug 2017 13:49:49 +1200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org On 19/08/17 02:21, Jeremy Finzel wrote: > On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe > > wrote: > > So do iostat or iotop show you if / where your disks are working > hardest? Or is this CPU overhead that's killing performance? > > > Sorry for the delayed reply. I took a look in more detail at the query > plans from our problem query during this incident. There are actually > 6 plans, because there were 6 unique queries. I traced one query > through our logs, and found something really interesting. That is that > all of the first 5 queries are creating temp tables, and all of them > took upwards of 500ms each to run. The final query, however, is a > simple select from the last temp table, and that query took 0.035ms! > This really confirms that somehow, the issue had to do with /writing > /to the SAN, I think. Of course this doesn't answer a whole lot, > because we had no other apparent issues with write performance at all. > > I also provide some graphs below. > > Hi, graphs for latency (or await etc) might be worth looking at too - sometimes the troughs between the IO spikes are actually when the disks have been overwhelmed with queued up pending IOs... Also SANs are notorious for this sort of thing - typically they have a big RAM cache that you are actually writing to, and everything is nice and fast until your workload (along with everyone else's) fills up the cache and then performance drops of a cliff for a while (I've seen SAN disks with iostat utilizations of 105% <-- Lol... and await numbers that scroll off the page in that scenario)! regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance