Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dXnVb-000093-3k for pgsql-performance@arkaria.postgresql.org; Wed, 19 Jul 2017 11:51:15 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dXnVa-0007FH-Kd for pgsql-performance@arkaria.postgresql.org; Wed, 19 Jul 2017 11:51:14 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dXnTn-00045j-V8 for pgsql-performance@postgresql.org; Wed, 19 Jul 2017 11:49:24 +0000 Received: from mail-qt0-x235.google.com ([2607:f8b0:400d:c0d::235]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dXnTg-0004qH-DH for pgsql-performance@postgresql.org; Wed, 19 Jul 2017 11:49:22 +0000 Received: by mail-qt0-x235.google.com with SMTP id 32so39334001qtv.1 for ; Wed, 19 Jul 2017 04:49:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=8VXAutXvyZnMR42Csm9EBSB+XVQFDIvEWCYpGL/df3M=; b=VkW5GbhJh7AwpZPpDVFUv8y0iHmT9R9Q4/wvsZlAqajHYb4mR+ThMJVyQFMjH0uRjQ 4nJwFnM29p+upUjDQUcSk7vdFxVTQdkyDl9FgMtKsC6wr+NFzxTaYhunddO+EgS4x0WJ 3chizNmchR2TzP7xiSYG16/6KHfUSk5zMv/zEcUITikOYyB771eeKdXws6UKPbbVCfsB L693wH5B2+SkLexw7tR3lYHmVUFamggOr3qjlpvDZytfa0jEpRiFG2eWHggVEi10mRaC 0qMf3FaqtlxSpMj3YftUjJADW2/X7xIVf2xMPk9nURrdvOqnmEvavlEOj+v2evJBefW8 XSVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=8VXAutXvyZnMR42Csm9EBSB+XVQFDIvEWCYpGL/df3M=; b=gXeqNE/z5UVaelVY9t6jbmOmB2D2MJNg3RV9Q6gpiTMhwm/+qFP714EBwNyLJ6sv7L ZFaRgaS8H5QndTaENecYv6JfxlIFHXzlKsZ619qsgWvxnel5/Lq9ocKo0Sa0Ap24ixkp RDKx4mRnX2XZ7t2RQwtfFuOd8urCyqt9I+FSg/Zj1faBVNQQiEo3H6c0Jt2yITgcOOk9 /xPFBx6sA1mNAAXVA/hXdv9a0QtxSqNdZ7sViHvGAKZefLHe8fE91MMqj9Wx+olRYIvQ ETM7Y/dkl6DdberpaxNMe6xq7xuDO4cpG0ZvI1619b5S7qESYF+nHU+j6ykzpmgDisi4 GveA== X-Gm-Message-State: AIVw1108xrTGCR6NMhAaJuC39ZBiwHKn/wX92rD920l6V/YB1PRpEO9a psMYOaZ7NsA1DF6VqVbcNX2GO4SFzQ== X-Received: by 10.200.46.194 with SMTP id i2mr2755118qta.89.1500464955436; Wed, 19 Jul 2017 04:49:15 -0700 (PDT) MIME-Version: 1.0 Received: by 10.12.157.11 with HTTP; Wed, 19 Jul 2017 04:48:54 -0700 (PDT) In-Reply-To: <20170718180152.GE17566@telsasoft.com> References: <20170718180152.GE17566@telsasoft.com> From: Charles Nadeau Date: Wed, 19 Jul 2017 13:48:54 +0200 Message-ID: Subject: Re: Very poor read performance, query independent To: Justin Pryzby Cc: Claudio Freire , Igor Neyman , Jeff Janes , "pgsql-performance@postgresql.org" Content-Type: multipart/alternative; boundary="001a11403880aab3920554aa382a" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a11403880aab3920554aa382a Content-Type: text/plain; charset="UTF-8" Justin, Thanks for the extensive reading list, very educative. After reading https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/ I was thinking that it could be a NUMA/THP-related problem. Turning off THP solved the "swap storm" problem. Some queries are even 40% faster with THP off. Then also turning off KSM improved performance by another 5% I was seriously worried about this issue as we received today another server with 144GB of RAM. I will try to post a little summary of all the suggestion I received via this thread later this week/early next week. Thanks! Charles On Tue, Jul 18, 2017 at 8:01 PM, Justin Pryzby wrote: > On Tue, Jul 18, 2017 at 02:13:58PM -0300, Claudio Freire wrote: > > On Tue, Jul 18, 2017 at 1:01 PM, Claudio Freire > wrote: > > > On Tue, Jul 18, 2017 at 6:20 AM, Charles Nadeau > > > wrote: > > >> Claudio, > > >> > > >> At one moment > > >> during the query, there is a write storm to the swap drive (a bit > like this > > >> case: > > >> https://www.postgresql.org/message-id/AANLkTi% > 3Diw4fC2RgTxhw0aGpyXANhOT%3DXBnjLU1_v6PdA%40mail.gmail.com). > > >> I can hardly explain it as there is plenty of memory on this server. > > > > > > That sounds a lot like NUMA zone_reclaim issues: > > > > > > https://www.postgresql.org/message-id/500616CB.3070408@2ndQuadrant.com > > > > I realize you have zone_reclaim_mode set to 0. Still, the symptoms are > > eerily similar. > > Did you look at disabling KSM and/or THP ? > > sudo sh -c 'echo 2 >/sys/kernel/mm/ksm/run' > > https://www.postgresql.org/message-id/20170524155855. > GH31097%40telsasoft.com > https://www.postgresql.org/message-id/CANQNgOrD02f8mR3Y8Pi= > zFsoL14RqNQA8hwz1r4rSnDLr1b2Cw@mail.gmail.com > https://www.postgresql.org/message-id/CAHyXU0y9hviyKWvQZxX5UWfH9M2LY > vwvAOPQ_DUPva2b71t12g%40mail.gmail.com > https://www.postgresql.org/message-id/20130716195834. > 8fe5c79249cb2ff0d4270b3e@yahoo.es > https://www.postgresql.org/message-id/CAE_gQfW3dBiELcOppYN6v%3D8%2B% > 2BpEeywD7iXGw-OT3doB8SXO4_A%40mail.gmail.com > https://www.postgresql.org/message-id/flat/1436268563235- > 5856914.post%40n5.nabble.com#1436268563235-5856914.post@n5.nabble.com > https://www.postgresql.org/message-id/CAL_0b1tJOZCx3Lo3Eve1RqGaT%2BJJ_ > Q7w4pkJ87WfWwXbTugnxw@mail.gmail.com > https://www.postgresql.org/message-id/556E2068.7070007@vuole.me > https://www.postgresql.org/message-id/1415981309.90631. > YahooMailNeo%40web133205.mail.ir2.yahoo.com > https://www.postgresql.org/message-id/CAHyXU0yXYpCXN4%3D81ZDRQu- > oGzrcq2qNAXDpyz4oiQPPAGk4ew%40mail.gmail.com > https://www.pythian.com/blog/performance-tuning-hugepages-in-linux/ > http://structureddata.org/2012/06/18/linux-6-transparent-huge-pages-and- > hadoop-workloads/ > > Justin > -- Charles Nadeau Ph.D. http://charlesnadeau.blogspot.com/ --001a11403880aab3920554aa382a Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Justin,

Thanks for the exten= sive reading list, very educative.

After reading <= a href=3D"https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa= -architecture/">https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-th= e-numa-architecture/ I was thinking that it could be a NUMA/THP-related= problem.
Turning off THP solved the "swap storm" probl= em. Some queries are even 40% faster with THP off.
Then also turn= ing off KSM improved performance by another 5%
I was seriously wo= rried about this issue as we received today another server with 144GB of RA= M.

I will try to post a little summary of all the = suggestion I received via this thread later this week/early next week.

Thanks!

Charles

On Tue, Jul 18, 201= 7 at 8:01 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:=
= On Tue, Jul 18, 2017 at 02:13:58PM -0300, Claudio Freire wrote:
> On Tue, Jul 18, 2017 at 1:01 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> > On Tue, Jul 18, 2017 at 6:20 AM, Charles Nadeau
> > <charles.nadeau@gm= ail.com> wrote:
> >> Claudio,
> >>
> >> At one moment
> >> during the query, there is a write storm to the swap drive (a= bit like this
> >> case:
> >> https://www.postgresql.org/message-id/AANLkTi%3= Diw4fC2RgTxhw0aGpyXANhOT%3DXBnjLU1_v6PdA%40mail.gmail.com). > >> I can hardly explain it as there is plenty of memory on this = server.
> >
> > That sounds a lot like NUMA zone_reclaim issues:
> >
> > https://www.postgres= ql.org/message-id/500616CB.3070408@2ndQuadrant.com
>
> I realize you have zone_reclaim_mode set to 0. Still, the symptoms are=
> eerily similar.

Did you look at disabling KSM and/or THP ?

sudo sh -c 'echo 2 >/sys/kernel/mm/ksm/run'

https://www.postgresql.o= rg/message-id/20170524155855.GH31097%40telsasoft.com
https://www.postgresql.org/message-id/CANQNgOrD02f8mR3Y8Pi= =3DzFsoL14RqNQA8hwz1r4rSnDLr1b2Cw@mail.gmail.com
https://www.postgresql.org/message-id/CAHyXU0y9hviyKWvQZxX5= UWfH9M2LYvwvAOPQ_DUPva2b71t12g%40mail.gmail.com
https://www.po= stgresql.org/message-id/20130716195834.8fe5c79249cb2ff0d4270b3e@<= wbr>yahoo.es
https://www.postgresql.org/message-id/CAE_gQfW3dBiELc= OppYN6v%3D8%2B%2BpEeywD7iXGw-OT3doB8SXO4_A%40mail.gmail.com https://www.postgresql.org/message-id/flat/1= 436268563235-5856914.post%40n5.nabble.com#1436268563235-5856914.p= ost@n5.nabble.com
https://www.postgresql.org/message-id/CAL_0b1tJOZCx3Lo3Eve1= RqGaT%2BJJ_Q7w4pkJ87WfWwXbTugnxw@mail.gmail.com
https://www.postgresql.org/messa= ge-id/556E2068.7070007@vuole.me
ht= tps://www.postgresql.org/message-id/1415981309.90631.YahooMailNeo= %40web133205.mail.ir2.yahoo.com
https://www.postgresql.org/message-id/CAHyXU0yXYpCXN4%3D8= 1ZDRQu-oGzrcq2qNAXDpyz4oiQPPAGk4ew%40mail.gmail.com
https://www.pythian.com/blog/performance-tuning-hugepages-in-linux/
http://stru= ctureddata.org/2012/06/18/linux-6-transparent-huge-pages-and-hadoop-workloads/

Justin



--
--001a11403880aab3920554aa382a--