Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bZLtv-0007rt-Tm for pgsql-performance@arkaria.postgresql.org; Mon, 15 Aug 2016 17:42:16 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1bZLtv-0001np-CT for pgsql-performance@arkaria.postgresql.org; Mon, 15 Aug 2016 17:42:15 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1bZLtu-0001mw-4l for pgsql-performance@postgresql.org; Mon, 15 Aug 2016 17:42:14 +0000 Received: from mail-io0-x233.google.com ([2607:f8b0:4001:c06::233]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1bZLtm-0007WX-8X for pgsql-performance@postgresql.org; Mon, 15 Aug 2016 17:42:13 +0000 Received: by mail-io0-x233.google.com with SMTP id 38so85792847iol.0 for ; Mon, 15 Aug 2016 10:42:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=+iA2bS5laT7Jdl1g0pwt0yr+knOd5a0KzoRPHWm7A/Q=; b=HS1MQll+o3bwd51xGcxSEArpMIHOYvqdG8yKVHFLIcqpks8PvBMAvA2qqm86pmBvhr d82dIL6665SglqTIKMyy1RKJO3JXlxCkf4rf2Nd2QlzR57tifGu2NdWpoFEAVi/Qh9is CV1HRERzYTIO3ZQjmGatjpaQYCsOAaSjjUS4xkCi/v4JZxOHk6SeYaM38L++8VZ9tjhQ Q44I3PWfx3p7fDQMwa+imjNZ2lIZ9e0QRe3ckc3VCLlBH4ZuV+Ua8l+pSvJhFIb34Nqw 3Z0RJXQ7sdc9Tdc+HJn5PEuXITTqFKGikAz4tMRb26YnVKDjXwDYBngMBIqnjXGm6mkF 7qhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=+iA2bS5laT7Jdl1g0pwt0yr+knOd5a0KzoRPHWm7A/Q=; b=Bk9CRZj8/AhnUGjDz6h/Jws+cFtpc43pqV1lsTiANP8n9FRaFIoQlspWRuA9FzcA+5 Hyuv70KbEaglD8Pg4tZWKTgUddHf1uSjrfN7BWjROjwlSJjBTKPbixPOVNpbqqLA2AzX BBf8e6eRxtTirKz0vOpdAA+G25WmQyBVi3N132nQ/kkqu+hQNaUTXDG+Pg8xDJ/5LayJ 1KseVVBSDbp+wkfr87sAt6AxeX8OMVZRJ7kz4/ppEABGyrEPVjrRiFtPZ8jU9+13Ag1L 75BFHVinWl0Lh1I6XRjcyJfTu0ooVYsoo+i1x2OTkl/zLKwmT2IE3T77c2H8gPIR3uIO +Xaw== X-Gm-Message-State: AEkoousvCJ2L+tFYyQrArreQtEdrQJXxKf2a7aS7sNIGKasJwgxI+y8e4u6LUEJSwyor9roHkXnuUyJHjY8Txg== X-Received: by 10.107.31.130 with SMTP id f124mr33892127iof.180.1471282924355; Mon, 15 Aug 2016 10:42:04 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.19.106 with HTTP; Mon, 15 Aug 2016 10:42:03 -0700 (PDT) In-Reply-To: <20160813185448.GP1179@telsasoft.com> References: <20160813185448.GP1179@telsasoft.com> From: Claudio Freire Date: Mon, 15 Aug 2016 14:42:03 -0300 Message-ID: Subject: Re: index fragmentation on insert-only table with non-unique column To: Justin Pryzby Cc: postgres performance list , Jeff Janes , Tom Lane Content-Type: text/plain; charset=UTF-8 X-Pg-Spam-Score: -2.7 (--) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org On Sat, Aug 13, 2016 at 3:54 PM, Justin Pryzby wrote: > On Sun, Jun 05, 2016 at 12:28:47PM -0700, Jeff Janes wrote: >> On Sun, Jun 5, 2016 at 9:03 AM, Tom Lane wrote: >> > Claudio Freire writes: >> >> So correlated index scans look extra favourable vs bitmap index scans >> >> because bitmap heap scans consider random page costs sans correlation >> >> effects (even though correlation applies to bitmap heap scans as >> >> well). >> > >> > Really? How? The index ordering has nothing to do with the order in >> > which heap tuples will be visited. >> >> It is not the order itself, but the density. >> >> If the index is read in a range scan (as opposed to =ANY scan), and >> the index lead column is correlated with the table ordering, then the >> parts of the table that need to be visited will be much denser than if >> there were no correlation. But Claudio is saying that this is not >> being accounted for. > > I didn't completely understand Claudio/Jeff here, and not sure if we're on the > same page. For queries on these tables, the index scan was very slow, due to > fragmented index on non-unique column, and seq scan would have been (was) > faster (even if it means reading 70GB and filtering out 6 of 7 days' data). > That was resolved by added a nightly reindex job (.. which sometimes competes > with other maintenance and has trouble running every table every night). Yes, but a bitmap index scan should be faster than both, but the planner is discarding it beause it estimates it will be slower, because it doesn't account for correlation between index keys and physical location. And, while what you clarify there would indeed affect the estimation for index scans, it would only make the issue worse: the planner thinks the index scan will be better than it really is, because it's expecting correlation, but the "fragmentation" of same-key runs destroys that correlation. A bitmap index scan would restore it, though, so the bitmap index scan would be that much better. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance