From: Tom Lane
To: Peter Geoghegan
cc: Justin Pryzby, pgsql-performance@postgresql.org
Subject: Re: index fragmentation on insert-only table with non-unique column
References: <20160524173914.GA11880@telsasoft.com>
Comments: In-reply-to Peter Geoghegan message dated "Tue, 24 May 2016 21:16:20 -0700"
Date: Wed, 25 May 2016 00:43:08 -0400
Message-ID: <4117.1464151388@sss.pgh.pa.us>

Peter Geoghegan writes:
> The basic problem is that the B-Tree code doesn't maintain this
> property. However, B-Tree index builds will create an index that
> initially has this property, because the tuplesort.c code happens to
> sort index tuples with a CTID tie-breaker.

Yeah.
I wonder what would happen if we used the same rule for index
insertions.  It would likely make insertions more expensive, but maybe
not by much.  The existing "randomization" rule for where to insert new
items in a run of identical index entries would go away, because the
insertion point would become deterministic.  I am not sure if that's
good or bad for insertion performance, but it would likely help for
scan performance.

			regards, tom lane
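
To make the difference concrete, here is a rough toy model in Python, not
the nbtree code. It assumes an index is just a sorted list of
(key, tid) pairs, with tid a (block, offset) tuple standing in for a CTID.
The function names are made up for illustration, and the "arbitrary slot"
policy is a deliberate simplification of the existing randomization rule,
which in the real code is tied to page boundaries and free space.

```python
import bisect
import random


def insert_at_arbitrary_duplicate(index, key, tid, rng):
    """Toy stand-in for the current rule: any slot inside the run of
    equal keys is acceptable, so pick one at random."""
    keys = [k for k, _ in index]          # O(n); fine for a sketch
    lo = bisect.bisect_left(keys, key)    # first slot of the run
    hi = bisect.bisect_right(keys, key)   # one past the last slot
    index.insert(rng.randint(lo, hi), (key, tid))


def insert_with_tid_tiebreak(index, key, tid):
    """Proposed rule: treat (key, tid) as the sort key, so the
    insertion point among duplicates is fully determined."""
    bisect.insort(index, (key, tid))


def backward_jumps(index):
    """Count adjacent entries with the same key whose tid goes backwards,
    i.e. places where a scan of one key's run would revisit earlier
    heap blocks instead of moving forward."""
    return sum(
        1
        for (k1, t1), (k2, t2) in zip(index, index[1:])
        if k1 == k2 and t2 < t1
    )


if __name__ == "__main__":
    rng = random.Random(42)
    arbitrary, tiebreak = [], []
    # Insert-only workload: tids only ever increase, keys repeat a lot.
    for i in range(1000):
        key = rng.choice([10, 20, 30])
        tid = (i // 50, i % 50)           # 50 toy tuples per heap block
        insert_at_arbitrary_duplicate(arbitrary, key, tid, rng)
        insert_with_tid_tiebreak(tiebreak, key, tid)
    print("arbitrary slot, backward jumps:", backward_jumps(arbitrary))
    print("tid tie-break, backward jumps:", backward_jumps(tiebreak))
```

With the tie-breaker, each run of equal keys stays in heap order no
matter how the inserts arrive, which is the property a freshly built
index has. The sketch says nothing about the cost side of the question:
it does not model page splits or the extra comparisons on insert.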