Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1diiBR-000326-UP for pgsql-performance@arkaria.postgresql.org; Fri, 18 Aug 2017 14:23:34 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1diiBQ-0000Kw-Si for pgsql-performance@arkaria.postgresql.org; Fri, 18 Aug 2017 14:23:32 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dii9e-0005fi-FC for pgsql-performance@postgresql.org; Fri, 18 Aug 2017 14:21:42 +0000 Received: from mail-wm0-x22f.google.com ([2a00:1450:400c:c09::22f]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dii9b-0007H7-5J for pgsql-performance@postgresql.org; Fri, 18 Aug 2017 14:21:41 +0000 Received: by mail-wm0-x22f.google.com with SMTP id t201so1108579wmt.1 for ; Fri, 18 Aug 2017 07:21:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=z7HXB25aXyoQ8m8UD2L9J6yYQAklJ/RSlTLjbV8X+v4=; b=DuaBNDzsK3wrCn4+eHh3l1SarpaYPkv0xnI3htrTocmQXeWKP2ydN+hxfSWLUVl4ej FD5Pg7KZZ4hk26ORH3lIdTzY9KNCvbt8I94vZ+tpOJ3SIazgicgW1TB+/Fh8F/+CPTBq g4UoV60j7Pb9G4IuhWQz1E9sMvUBplYSdKV1PXCuJN0DIT2uDjZqBZSv0j3kmT0f7i02 Dew1zHit4YeTDjMAl8zbcjxjP8cbcBcLYZ9HJDWnkNE175tR9LEBEshbvQqWvDuFetGu SsKeGf7v2DJ4E3Jvt1ba2hExbDl+DZvVoDre5R2eU5AT1RGZmUG9jWuOyI8cMCsOQBzE sIVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=z7HXB25aXyoQ8m8UD2L9J6yYQAklJ/RSlTLjbV8X+v4=; b=AveaVfZ5yAa1CDpCuHsWSU/WUolPVA8lYSbkHZcSZQJwHocpFJQdygb0Ken4mbxHYa 0p3p5iJacEvepjYOf3b5q/3b8v5BhnCEh4vYTjIE3FmWmqNwUZm4Sd0k/r2M4V5LgfK4 Rfv/5RN+jHbm/klHYMGUzlu9akbl/Dhf5qJEs6M0rg+N23QROJ9N2EXJrzdYTTiBp3Qe IBK4tqI2hXSSDcePvVLpqq3qRkLoL9oZ7UJ1p9HSjss+VqojH8OddZ9/GAhh00XJDT0h ACWc5hAD60Ixh33VO2+A/s6dDuYP9Zg9QABcFikoRO0w92zispG1Z1tGFGCyBWJWzJxC PhqQ== X-Gm-Message-State: AHYfb5iOB0xBYdPtAntB8xx5q00NkoPFPuckTcHSxDLjT/O1zFjtNbzp B9rm/tPWP/t1MktercthWZ+OZipGzQ== X-Received: by 10.28.131.193 with SMTP id f184mr1483870wmd.117.1503066097260; Fri, 18 Aug 2017 07:21:37 -0700 (PDT) MIME-Version: 1.0 Received: by 10.223.155.1 with HTTP; Fri, 18 Aug 2017 07:21:36 -0700 (PDT) In-Reply-To: References: <87mv71eoep.fsf@jsievers.enova.com> From: Jeremy Finzel Date: Fri, 18 Aug 2017 09:21:36 -0500 Message-ID: Subject: Re: Odd sudden performance degradation related to temp object churn To: Scott Marlowe Cc: Jerry Sievers , Peter Geoghegan , postgres performance list Content-Type: multipart/alternative; boundary="001a1144226ecd2095055707d88a" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a1144226ecd2095055707d88a Content-Type: text/plain; charset="UTF-8" On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe wrote: > So do iostat or iotop show you if / where your disks are working > hardest? Or is this CPU overhead that's killing performance? > Sorry for the delayed reply. I took a look in more detail at the query plans from our problem query during this incident. There are actually 6 plans, because there were 6 unique queries. I traced one query through our logs, and found something really interesting. That is that all of the first 5 queries are creating temp tables, and all of them took upwards of 500ms each to run. The final query, however, is a simple select from the last temp table, and that query took 0.035ms! This really confirms that somehow, the issue had to do with *writing *to the SAN, I think. Of course this doesn't answer a whole lot, because we had no other apparent issues with write performance at all. I also provide some graphs below. 7pm-3am on 8/10 (first incidents were around 10:30pm, other incidents ~1am, 2am): Local Disk IO: [image: Screen Shot 2017-08-18 at 8.20.06 AM.png] SAN IO: [image: Screen Shot 2017-08-18 at 8.16.59 AM.png] CPU: [image: Screen Shot 2017-08-18 at 8.20.58 AM.png] 7-9pm on 8/10 (controlled attempts starting a little after 7): CPU: [image: Screen Shot 2017-08-18 at 8.43.35 AM.png] Write IO on SAN: [image: Screen Shot 2017-08-18 at 8.44.32 AM.png] Read IO on Local disk: [image: Screen Shot 2017-08-18 at 8.46.27 AM.png] Write IO on Local disk: [image: Screen Shot 2017-08-18 at 8.46.58 AM.png] --001a1144226ecd2095055707d88a Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On T= ue, Aug 15, 2017 at 12:07 PM, Scott Marlowe <scott.marlowe@gmail.c= om> wrote:
So do iostat or iotop show you if / where your disks are working
hardest? Or is this CPU overhead that's killing performance?

Sorry for the delayed reply. =C2=A0I took a look in more detail at the query plans from our problem que= ry during this incident.=C2=A0 There are actually 6 plans, because there we= re 6 unique queries.=C2=A0 I traced one query through our logs, and found s= omething really interesting. =C2=A0= That is that all of the first 5 queries are creating temp tables, and all o= f them took upwards of 500ms each to run.=C2=A0 The final query, however, i= s a simple select from the last temp table, and that query took 0.035ms!=C2= =A0 This really confirms that somehow, the issue had to do with=C2=A0writing=C2=A0to the SAN, I think.=C2=A0 Of course this doesn'= t answer a whole lot, because we had no other apparent issues with write pe= rformance at all.

I also provide some graphs below.

7pm-3am on 8/10 (first inciden= ts were around 10:30pm, other incidents ~1am, 2am):

Local Disk IO:=

3D"Screen


SAN IO:

3D"Screen


CPU:


7-9pm on 8/10 (contr= olled attempts starting a little after 7):

CPU:

3D"Screen

Write IO = on SAN:


3D"Screen


Read IO on Local disk:

3D"Screen

Write IO= on Local disk:

3D"Screen


--001a1144226ecd2095055707d88a--