Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b8yu4-00048O-BK for pgsql-performance@arkaria.postgresql.org; Fri, 03 Jun 2016 23:53:24 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1b8yu3-0005Xb-6K for pgsql-performance@arkaria.postgresql.org; Fri, 03 Jun 2016 23:53:23 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1b8yu2-0005XS-Jf for pgsql-performance@postgresql.org; Fri, 03 Jun 2016 23:53:22 +0000 Received: from mail-io0-x22a.google.com ([2607:f8b0:4001:c06::22a]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1b8ytt-0000w3-7b for pgsql-performance@postgresql.org; Fri, 03 Jun 2016 23:53:21 +0000 Received: by mail-io0-x22a.google.com with SMTP id t40so95109979ioi.0 for ; Fri, 03 Jun 2016 16:53:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=telsasoft-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=CpHP3Jti8pW/LTGELKy1oPdDVKLJDxMpRPYwuzJMowY=; b=sI+XVW4Klp7kSGjp0U2tttbsXcTpwwSmIt1gU29tEFNlSMufrAQlz3d8cEUwHSGTTX 2TpyzNrRjRgl/a5PBCjdRxpkWSVFMrIqHqGUGrWtieufDPF2YmdZ6yZcpvFPX7Av6eHM 5ask0KR6N+oDnIcTl1pd829IkI73b2S+V4zWrFpFTjOmvGqqQjQY26DNmnR/3jqONCj/ 2P2QEqeLHc1oPCPcM+efB05pmoyS6+q+LDXFQM3b86OU7glWOhwQj9vOwvwDOZ9Tbedu hDU0w1z24mgIQw1USc2RIlm2c3XgkqJGcPw0+rqZGLYilCzRxIsIkbyPFnJFeWQgUpQq +A1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=CpHP3Jti8pW/LTGELKy1oPdDVKLJDxMpRPYwuzJMowY=; b=Mp6+QnyDbXqAkKaRmItgi9oOnT8+/Z42etarNNJpJX7AEHDM7GTjgH1duzVFZq9oLw muVwSu33SQ5J5rV13BBX2YKg3gtYZf8+oklA/73ehu6AdXaei88bGTySM3gsgsTE8XkI bCDfTSKKtCK+2wfCqb/Y4htNjxDGSGR0egExNz16jNH0VTGFwv5ZTZrtzjv0vwPJyXvp iKOJeZS03ykNlUM+/LlnWAyf31pIZv0lNBP6WXVryNuRvd48hkCRzMuGTncuJnMrYSuB BxyqkyblstK5KfWYEBzXbNbWsDsz2ki2bIHEVHuqCLFdgmGvWDj/KMU8wtZiC1Pg+gtF LVdA== X-Gm-Message-State: ALyK8tIGUJwv6ZxC7qEllkfsrIrD449EvO4DdbNoZs8Ay7kJusulxun3fC0z+gKdbQWM9Q== X-Received: by 10.107.173.218 with SMTP id m87mr9516337ioo.131.1464997991148; Fri, 03 Jun 2016 16:53:11 -0700 (PDT) Received: from pryzbyj (charmander.telsasoft.com. [50.244.222.1]) by smtp.gmail.com with ESMTPSA id e143sm874144ita.22.2016.06.03.16.53.10 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Jun 2016 16:53:10 -0700 (PDT) Received: by pryzbyj (Postfix, from userid 1000) id A215E800FD8; Fri, 3 Jun 2016 18:54:06 -0500 (CDT) Date: Fri, 3 Jun 2016 18:54:06 -0500 From: Justin Pryzby To: pgsql-performance@postgresql.org Subject: Re: index fragmentation on insert-only table with non-unique column Message-ID: <20160603235406.GN23616@telsasoft.com> References: <20160524173914.GA11880@telsasoft.com> <20160525140034.GB21220@telsasoft.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Pg-Spam-Score: -2.6 (--) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org On Fri, Jun 03, 2016 at 06:26:33PM -0300, Claudio Freire wrote: > On Wed, May 25, 2016 at 11:00 AM, Justin Pryzby wrote: > >> > First, I found I was able to get 30-50min query results on full week's table by > >> > prefering a seq scan to an index scan. The row estimates seemed fine, and the > >> > only condition is the timestamp, so the planner's use of index scan is as > >> > expected. > >> > >> Can you show us the query? I would expect a bitmap scan of the index > >> (which would do what you want, but even more so), instead. > > See explain, also showing additional tables/views being joined. It's NOT doing > > a bitmap scan though, and I'd be interested to find why; I'm sure that would've > > improved this query enough so it never would've been an issue. > > https://explain.depesz.com/s/s8KP > > > > -> Index Scan using cdrs_huawei_pgwrecord_2016_05_01_recordopeningtime_idx on cdrs_huawei_pgwrecord_2016_05_01 (cost=0.56..1601734.57 rows=8943848 width=349) > > Index Cond: ((recordopeningtime >= '2016-05-07 00:00:00'::timestamp without time zone) AND (recordopeningtime < '2016-05-08 00:00:00'::timestamp without time zone)) > > Please show your guc settings ( see > https://wiki.postgresql.org/wiki/Server_Configuration ) > > A plan node like that, if it would result in I/O, with proper > configuration should have selected a bitmap index/heap scan. If it > didn't, it probably thinks it has more cache than it really does, and > that would mean the wrong setting was set in effective_cache_size. ts=# SELECT name, current_setting(name), SOURCE FROM pg_settings WHERE SOURCE='configuration file'; dynamic_shared_memory_type | posix | configuration file effective_cache_size | 64GB | configuration file effective_io_concurrency | 8 | configuration file huge_pages | try | configuration file log_autovacuum_min_duration | 0 | configuration file log_checkpoints | on | configuration file maintenance_work_mem | 6GB | configuration file max_connections | 200 | configuration file max_wal_size | 4GB | configuration file min_wal_size | 6GB | configuration file shared_buffers | 8GB | configuration file wal_compression | on | configuration file work_mem | 1GB | configuration file I changed at least maintenance_work_mem since I originally wrote, to try to avoid tempfiles during REINDEX (though I'm not sure it matters, as the tempfiles are effective cached and may never actually be written). It's entirely possible those settings aren't ideal. The server has 72GB RAM. There are usually very few (typically n<3 but at most a handful) nontrivial queries running at once, if at all. I wouldn't expect any data that's not recent (table data last 2 days or index from this month) to be cached, and wouldn't expect that to be entirely cached, either: ts=# SELECT sum(pg_table_size(oid))/1024^3 gb FROM pg_class WHERE relname~'_2016_05_..$'; gb | 425.783050537109 ts=# SELECT sum(pg_table_size(oid))/1024^3 gb FROM pg_class WHERE relname~'_2016_05_...*idx'; gb | 60.0909423828125 ts=# SELECT sum(pg_table_size(oid))/1024^3 gb FROM pg_class WHERE relname~'_201605.*idx'; gb | 4.85528564453125 ts=# SELECT sum(pg_table_size(oid))/1024^3 gb FROM pg_class WHERE relname~'_201605$'; gb | 86.8688049316406 As a test, I did SET effective_cache_size='1MB', before running explain, and still does: | -> Index Scan using cdrs_huawei_pgwrecord_2016_05_29_recordopeningtime_idx on cdrs_huawei_pgwrecord_2016_05_29 (cost=0.44..1526689.49 rows=8342796 width=355) | Index Cond: ((recordopeningtime >= '2016-05-29 00:00:00'::timestamp without time zone) AND (recordopeningtime < '2016-05-30 00:00:00'::timestamp without time zone)) I Set enable_indexscan=0, and got: | -> Bitmap Heap Scan on cdrs_huawei_pgwrecord_2016_05_29 (cost=168006.10..4087526.04 rows=8342796 width=355) | Recheck Cond: ((recordopeningtime >= '2016-05-29 00:00:00'::timestamp without time zone) AND (recordopeningtime < '2016-05-30 00:00:00'::timestamp without time zone)) | -> Bitmap Index Scan on cdrs_huawei_pgwrecord_2016_05_29_recordopeningtime_idx (cost=0.00..165920.40 rows=8342796 width=0) | Index Cond: ((recordopeningtime >= '2016-05-29 00:00:00'::timestamp without time zone) AND (recordopeningtime < '2016-05-30 00:00:00'::timestamp without time zone)) Here's a minimal query which seems to isolate the symptom: ts=# explain (analyze,buffers) SELECT sum(duration) FROM cdrs_huawei_pgwrecord_2016_05_22 WHERE recordopeningtime>='2016-05-22' AND recordopeningtime<'2016-05-23'; | Aggregate (cost=2888731.67..2888731.68 rows=1 width=8) (actual time=388661.892..388661.892 rows=1 loops=1) | Buffers: shared hit=4058501 read=1295147 written=35800 | -> Index Scan using cdrs_huawei_pgwrecord_2016_05_22_recordopeningtime_idx on cdrs_huawei_pgwrecord_2016_05_22 (cost=0.56..2867075.33 rows=8662534 w |idth=8) (actual time=0.036..379332.910 rows=8575673 loops=1) | Index Cond: ((recordopeningtime >= '2016-05-22 00:00:00'::timestamp without time zone) AND (recordopeningtime < '2016-05-23 00:00:00'::timestamp | without time zone)) | Buffers: shared hit=4058501 read=1295147 written=35800 | Planning time: 0.338 ms | Execution time: 388661.947 ms And here's an older one to avoid cache, with enable_indexscan=0 |ts=# explain (analyze,buffers) SELECT sum(duration) FROM cdrs_huawei_pgwrecord_2016_05_08 WHERE recordopeningtime>='2016-05-08' AND recordopeningtime<'2016-05-09'; | Aggregate (cost=10006286.58..10006286.59 rows=1 width=8) (actual time=44219.156..44219.156 rows=1 loops=1) | Buffers: shared hit=118 read=1213887 written=50113 | -> Bitmap Heap Scan on cdrs_huawei_pgwrecord_2016_05_08 (cost=85142.24..9985848.96 rows=8175048 width=8) (actual time=708.024..40106.062 rows=8179338 loops=1) | Recheck Cond: ((recordopeningtime >= '2016-05-08 00:00:00'::timestamp without time zone) AND (recordopeningtime < '2016-05-09 00:00:00'::timestamp without time zone)) | Rows Removed by Index Recheck: 74909 | Heap Blocks: lossy=1213568 | Buffers: shared hit=118 read=1213887 written=50113 | -> Bitmap Index Scan on cdrs_huawei_pgwrecord_2016_05_08_recordopeningtime_idx1 (cost=0.00..83098.48 rows=8175048 width=0) (actual time=706.557..706.557 rows=12135680 loops=1) | Index Cond: ((recordopeningtime >= '2016-05-08 00:00:00'::timestamp without time zone) AND (recordopeningtime < '2016-05-09 00:00:00'::timestamp without time zone)) | Buffers: shared hit=117 read=320 | Planning time: 214.786 ms | Execution time: 44228.874 ms |(12 rows) Thanks for your help. Justin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance