Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dXdVE-0005JK-BT for pgsql-performance@arkaria.postgresql.org; Wed, 19 Jul 2017 01:10:12 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dXdVD-0006z0-5p for pgsql-performance@arkaria.postgresql.org; Wed, 19 Jul 2017 01:10:11 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dXdTT-0003ql-QS for pgsql-performance@postgresql.org; Wed, 19 Jul 2017 01:08:24 +0000 Received: from mail-qk0-x244.google.com ([2607:f8b0:400d:c09::244]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dXdTM-0008Qe-CW for pgsql-performance@postgresql.org; Wed, 19 Jul 2017 01:08:22 +0000 Received: by mail-qk0-x244.google.com with SMTP id q66so4570267qki.1 for ; Tue, 18 Jul 2017 18:08:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=nv52xhUTBosBe79AHgO4SL4rind6t+yjvIFH+2DEQLM=; b=l8WsNWSK/KNpn+IdtKGaLOtYYMz9U2Ox7BPPkeQIc0IJYpAsfaec14I7VKZsLyB5/E SuELY8VMruky5IEScW8U5CyE41uutkw33gVKc+WUrvRvvNLP5DYrO7abzu/v/BP5TBD8 Pm7HIGO8PdnYwz3FpACCF5NLzgkzFCtyF9a2P3im6maQZMl2zB0CSTzRH00G+GWM3iOR GkPn0H9GxUET9Adb+5qp8Y2YX1PXxD74JXdSIEWU3a27ePMvVBclwxawzsIka13lIGG1 LL1atsdqsddXZ1O+5WB3q+vBrwCz+ZfPJnFrEEU/IUIP/DwsRBN/mTW6T27iLk+BCF8G 7GTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=nv52xhUTBosBe79AHgO4SL4rind6t+yjvIFH+2DEQLM=; b=ojciWRoAMmOwFY5YA3e//nK81nVxmR6A8p3GpYZO0iQ4EoozvPxozGRohFOSzwIjGD hEmmKHEmULRO19H0uLIbpo4zcdE+7HPJO6f50su4Dq+YCzEmU1orNseXRUiX5V0mLPCR fVd2Q67eZZRpJKyXsLF2+WTAWSFB9wrkskWNj5S5ExaMU7ZrEfcfQZ5/hp6/noR0klq4 CSFtPlR+nQUswdjoSsCKvWoVQHtFmKwejTD87K62izrm3mJ6S5eaAGYjo73cgc+Y77Pe kZrRTGhRZgVJUOgqeqpbA90+SPHE9M/8FSFGfJ1RMU2Aw2LYb9W0m+gZ9/5rzaypNFIK x5Ow== X-Gm-Message-State: AIVw111klfgEMiiB5Vbu+Kii8blWPr6yWNlfGrAvp2byR3Avc9O/QYuf C+bVPFRsEfyq2fOSCLtjaabM0wnO+A== X-Received: by 10.55.33.195 with SMTP id f64mr423887qki.208.1500426495327; Tue, 18 Jul 2017 18:08:15 -0700 (PDT) MIME-Version: 1.0 Received: by 10.140.82.7 with HTTP; Tue, 18 Jul 2017 18:08:14 -0700 (PDT) In-Reply-To: References: From: Scott Marlowe Date: Tue, 18 Jul 2017 19:08:14 -0600 Message-ID: Subject: Re: Very poor read performance, query independent To: Charles Nadeau Cc: Claudio Freire , Igor Neyman , Jeff Janes , "pgsql-performance@postgresql.org" Content-Type: text/plain; charset="UTF-8" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org On Tue, Jul 18, 2017 at 3:20 AM, Charles Nadeau wrote: > Claudio, > > Find attached the iostat measured while redoing the query above > (iostat1.txt). sda holds my temp directory (noop i/o scheduler), sdb the > swap partition (cfq i/o scheduler) only and sdc (5 disks RAID0, noop i/o > scheduler) holds the data. I didn't pay attention to the load caused by 12 > parallel scans as I thought the RAID card would be smart enough to > re-arrange the read requests optimally regardless of the load. At one moment > during the query, there is a write storm to the swap drive (a bit like this > case: > https://www.postgresql.org/message-id/AANLkTi%3Diw4fC2RgTxhw0aGpyXANhOT%3DXBnjLU1_v6PdA%40mail.gmail.com). My experience from that case (and few more) has led me to believe that Linux database servers with plenty of memory should have their swaps turned off. The Linux kernel works hard to swap out little used memory to make more space for caching active data. Problem is that whatever decides to swap stuff out gets stupid when presented with 512GB RAM and starts swapping out things like sys v shared_buffers etc. Here's the thing, either your memory is big enough to buffer your whole data set, so nothing should get swapped out to make room for caching. OR your dataset is much bigger than memory. In which case, making more room gets very little if it comes at the cost of waiting for stuff you need to get read back in. Linux servers should also have zone reclaim turned off, and THP disabled. Try running "sudo swapoff -a" and see if it gets rid of your swap storms. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance