X-Original-To: pgsql-docs-postgresql.org@localhost.postgresql.org Received: from localhost (av.hub.org [200.46.204.144]) by svr1.postgresql.org (Postfix) with ESMTP id C50D3D9241 for ; Thu, 20 Oct 2005 21:35:05 -0300 (ADT) Received: from svr1.postgresql.org ([200.46.204.71]) by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) with ESMTP id 00820-04 for ; Fri, 21 Oct 2005 00:34:59 +0000 (GMT) Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130]) by svr1.postgresql.org (Postfix) with ESMTP id B4EF7D9233 for ; Thu, 20 Oct 2005 21:35:01 -0300 (ADT) Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id j9L0Z1Uw015865; Thu, 20 Oct 2005 20:35:01 -0400 (EDT) To: Teodor Sigaev Cc: Michael Fuhr , pgsql-docs@postgresql.org Subject: Re: Multicolumn index doc out of date? In-reply-to: <4326BBF5.4090900@sigaev.ru> References: <20050912142647.GA34685@winnie.fuhr.org> <23966.1126553464@sss.pgh.pa.us> <4326BBF5.4090900@sigaev.ru> Comments: In-reply-to Teodor Sigaev message dated "Tue, 13 Sep 2005 15:45:57 +0400" Date: Thu, 20 Oct 2005 20:35:01 -0400 Message-ID: <15864.1129854901@sss.pgh.pa.us> From: Tom Lane X-Virus-Scanned: by amavisd-new at hub.org X-Spam-Status: No, hits=0.006 required=5 tests=[AWL=0.006] X-Spam-Level: X-Archive-Number: 200510/58 X-Sequence-Number: 3303 [ getting back to this documentation issue finally ] Teodor Sigaev writes: > I disagree with last affirmation: inner pages of index contains fair union of > keys and enough helpful to select. Mailware ( http://www.pgsql.ru/db/mw ) > sucsessfully use combined GiST index (date, tsvector) for searching. > GiST's split algorithm is good for unique leading keys, not so bad for small > number of non-unique values and bad for all equals leading key. But "bad" means > that itsn't optimal as picksplit for other keys may be. If there is several keys > which can be moved on left or right page without changing union of first key for > each page then GiST try put its on page (left or right) with smallest penalty > calculated by other keys. This algorithm is very similar to defining page to put > tuple with normal processing (without page split). > With unique leading key GiST's split is fully similar to BTree - it looks only > at leading key, but gistchoose isn't. Gistchoose (gistutil.c:622) chooses child > with smallest penalty and it looks to other keys if several leading keys has the > same penalty. In a GiST tree different keys may have the same penalty value with > new key. OK, how about this text then? A multicolumn GiST index can only be used when there is a query condition on its leading column. Conditions on additional columns restrict the entries returned by the index, but the condition on the first column is the most important one for determining how much of the index needs to be scanned. A GiST index will be relatively ineffective if its first column has only a few distinct values, even if there are many distinct values in additional columns. regards, tom lane