Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dV5iz-000410-L3 for pgsql-performance@arkaria.postgresql.org; Wed, 12 Jul 2017 00:41:53 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dV5ix-0005wz-Aj for pgsql-performance@arkaria.postgresql.org; Wed, 12 Jul 2017 00:41:51 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dV5hA-0002oQ-Nb for pgsql-performance@postgresql.org; Wed, 12 Jul 2017 00:40:01 +0000 Received: from mail-vk0-x236.google.com ([2607:f8b0:400c:c05::236]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dV5h7-0000Ik-8E for pgsql-performance@postgresql.org; Wed, 12 Jul 2017 00:39:59 +0000 Received: by mail-vk0-x236.google.com with SMTP id r126so4656890vkg.0 for ; Tue, 11 Jul 2017 17:39:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=icZmk10RrqczSjCUaubLi4Cv3dXx54YQfY7mv8t8B9Y=; b=JNC337gJDne26b9MUMxz6tQxYq4NllWUK4vN98fgjvV+4KooKqOZGq/K54/8M1XrEP JG69Lpw7E7gjNVRuJvBfXrBJdPMmI++0FY77Thq5i6Tbt7G33IOiHaKa7p29kEjIKqMU GrOgHnfo+UVkUcSSGmaYpnd3RgRcT+hZ12EW4mduoflPJT7sl4uOEhp6JbVUTL6k0wFx GK44z8F9WpaKov1HLJxBHP2VtM/NpF1jiBjdakXaerrDFinsstp7ov85dSzQt1VMohtq wQv5GFxpK+3XKi5Y/JqVsBcOAiOihqwU04GubYeFolKaqBbpYPv7MoVIKQUdn7nv+sLH EA/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=icZmk10RrqczSjCUaubLi4Cv3dXx54YQfY7mv8t8B9Y=; b=sIDmsiZC+cfAeI3QXlDaXjJUTEziN5VmulGo7J7rIhoud/1rK+RHtH0rfec6PreZBu 8cEprpDXjdW0WXkT6rRc1GTJZFZNqREAkFykscvRRMLl9DtuncK0cdf9dKbOODZaKA37 A3LbLCXNuhMQZxa0ulwi38nKMMNELx4xmeCFBCY1lCsHLEXp94p4vK2YAU1hOWBo+/OG /wzFVgwYJQ+J8SYfCIomcyRFoWsqtavYW+CGRLu0kQAA1AonZYSvkodVVw4PLzkB0ZbH 1p3bGwSi26RKNdsm4prKxAJsuhQGpPuBHSono66VMg43A3h8QcKCKFBi/WW9s6xYAfyF dLrg== X-Gm-Message-State: AIVw110jnyHkAFS8wrEzutU/MYQfv0A/lGow5S9qZvwIekGEXYIBex5C gx1/1mompLWJPk+9tHNDKGiwChMyGg== X-Received: by 10.31.21.66 with SMTP id 63mr162266vkv.61.1499819995527; Tue, 11 Jul 2017 17:39:55 -0700 (PDT) MIME-Version: 1.0 Received: by 10.159.56.201 with HTTP; Tue, 11 Jul 2017 17:39:54 -0700 (PDT) In-Reply-To: References: From: Jeff Janes Date: Tue, 11 Jul 2017 17:39:54 -0700 Message-ID: Subject: Re: Very poor read performance, query independent To: Charles Nadeau Cc: "pgsql-performance@postgresql.org" Content-Type: multipart/alternative; boundary="001a1143aed40f7db60554140ea4" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a1143aed40f7db60554140ea4 Content-Type: text/plain; charset="UTF-8" On Tue, Jul 11, 2017 at 4:02 AM, Charles Nadeau wrote: > Jeff, > > I used fio in a quick benchmarking script inspired by https://smcleod.net/ > benchmarking-io/: > > #!/bin/bash > #Random throughput > echo "Random throughput" > sync > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 > --name=test --filename=test --bs=4M --iodepth=256 --size=10G > --readwrite=randread --ramp_time=4 > #Random IOPS > echo "Random IOPS" > sync > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 > --name=test --filename=test --bs=4k --iodepth=256 --size=4G > --readwrite=randread --ramp_time=4 > #Sequential throughput > echo "Sequential throughput" > sync > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 > --name=test --filename=test --bs=4M --iodepth=256 --size=10G > --readwrite=read --ramp_time=4 > #Sequential IOPS > echo "Sequential IOPS" > sync > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 > --name=test --filename=test --bs=4k --iodepth=256 --size=4G > --readwrite=read --ramp_time=4 > > I don't think any of those are directly relevant to PostgreSQL, as it doesn't use direct IO, doesn't use libaio, and is rarely going to get anywhere near 256 iodepth. So the best they can do is put a theoretical ceiling on the performance. Also, random IO with a 4MB stride doesn't make any sense from a PostgreSQL perspective. > > Performing the test you suggested, I get 128.5MB/s. Monitoring the test, I > find that the throughput is constant from start to finish and that the > iowait is also constant at 5%: > I would have expected it to do better than that. Maybe you increase the kernel readahead setting. I've found the default to be much too small. But it doesn't make much difference to you, as you appear to be doing random IO in your queries, not sequential. > Could you suggest another way to benchmark random reads? > Your 1100 IOPS times 8kb block size gives about 8MB/s of throughput, which is close to what you report. So I think I'd would instead focus on tuning your actual queries. You say the problem is not query-dependent, but I think that that just means all the queries you looked at are similar. If you looked at a query that can't use indexes, like count(unindexed_column) from biggest_table; you would find it doing much more IO than 4MB/s. Can you pick the simplest query you actually care about, and post both an "explain (analyze, timing off)" and an "explain (analyze, buffers)" for it? (Preferably turning "track_io_timing" on first). One other question I had, you said you had "2x Intel Xeon E5550", which should be 8 CPU (or 16, if the hyperthreads are reported as separate CPUs). But you also said: "Also using dstat I can see that iowait time is at about 25%". Usually if there is only one thing going on on the server, then IOWAIT won't be more than reciprocal of #CPU. Is the server busy doing other stuff at the same time you are benchmarking it? Cheers, Jeff --001a1143aed40f7db60554140ea4 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On T= ue, Jul 11, 2017 at 4:02 AM, Charles Nadeau <charles.nadeau@gmail= .com> wrote:
Jeff,

I used fio in a q= uick benchmarking script inspired by https://smcleod.net/benchmarking-io/:=
#!/bin/bash
#Random throughput
echo "Random = throughput"
sync
fio --randrepeat=3D1 --ioengine= =3Dlibaio --direct=3D1 --gtod_reduce=3D1 --name=3Dtest --filename=3Dtest --= bs=3D4M --iodepth=3D256 --size=3D10G --readwrite=3Drandread --ramp_time=3D4=
#Random IOPS
echo "Random IOPS"
sy= nc
fio --randrepeat=3D1 --ioengine=3Dlibaio --direct=3D1 --gtod_r= educe=3D1 --name=3Dtest --filename=3Dtest --bs=3D4k --iodepth=3D256 --size= =3D4G --readwrite=3Drandread --ramp_time=3D4
#Sequential throughp= ut
echo "Sequential throughput"
sync
fio --randrepeat=3D1 --ioengine=3Dlibaio --direct=3D1 --gtod_reduce=3D1 -= -name=3Dtest --filename=3Dtest --bs=3D4M --iodepth=3D256 --size=3D10G --rea= dwrite=3Dread --ramp_time=3D4
#Sequential IOPS
echo &qu= ot;Sequential IOPS"
sync
fio --randrepeat=3D1 --io= engine=3Dlibaio --direct=3D1 --gtod_reduce=3D1 --name=3Dtest --filename=3Dt= est --bs=3D4k --iodepth=3D256 --size=3D4G --readwrite=3Dread --ramp_time=3D= 4

I don't thin= k any of those are directly relevant to PostgreSQL, as it doesn't use d= irect IO, doesn't use libaio, and is rarely going to get anywhere near = 256 iodepth.=C2=A0 So the best they can do is put a theoretical ceiling on = the performance.=C2=A0 Also, random IO with a 4MB stride doesn't make a= ny sense from a PostgreSQL perspective.

=C2=A0

Performing the test you suggested, I get 128.5MB/s. M= onitoring the test, I find that the throughput is constant from start to fi= nish and that the iowait is also constant at 5%:

I would have expected it to do better than that.=C2=A0 Ma= ybe you increase the kernel readahead setting.=C2=A0 I've found the def= ault to be much too small.=C2=A0 But it doesn't make much difference to= you, as you appear to be doing random IO in your queries, not sequential.<= /div>


Could you suggest another way to benchmark = random reads?

Your 1100 IOPS ti= mes 8kb block size gives about 8MB/s of throughput, which is close to what = you report.=C2=A0 So I think I'd would instead focus on tuning your act= ual queries.=C2=A0 You say the problem is not query-dependent, but I think = that that just means all the queries you looked at are similar.=C2=A0 If yo= u looked at a query that can't use indexes, like count(unindexed_column= ) from biggest_table; you would find it doing much more IO than 4MB/s.

Can you pick the simplest query you actually care abou= t, and post both an "explain (analyze, timing off)" and an "= explain (analyze, buffers)" for it? =C2=A0(Preferably turning "tr= ack_io_timing" on first).

One other question = I had, you said you had "2x Intel Xeon E5550", which should be 8 CPU (or 16, if the hypert= hreads=C2=A0
are reported as separate CPUs).=C2=A0 But you also said: "Also using dstat I can se= e that iowait time is at about 25%". =C2=A0Usually if there is only one thing goi= ng on on the server, then IOWAIT won't be more than reciprocal=C2=A0of = #CPU.=C2=A0 Is the server busy doing other stuff at the same time you are b= enchmarking it?

= Cheers,

Jeff
--001a1143aed40f7db60554140ea4--