From: Teodor Sigaev <[email protected]>
To: Tom Lane <[email protected]>
Cc: Michael Fuhr <[email protected]>
Cc: [email protected]
Cc: Oleg Bartunov <[email protected]>
Subject: Re: Multicolumn index doc out of date?
Date: Fri, 21 Oct 2005 13:57:12 +0400
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>



Tom Lane wrote:
> [ getting back to this documentation issue finally ]
> 
> Teodor Sigaev <[email protected]> writes:
> 
>>I disagree with the last statement: the inner pages of the index contain a fair
>>union of keys, which is helpful enough for selection. Mailware
>>( http://www.pgsql.ru/db/mw ) successfully uses a combined GiST index
>>(date, tsvector) for searching.
> 
> 
>>GiST's split algorithm is good for unique leading keys, not so bad for a small
>>number of non-unique values, and bad when all leading keys are equal. But "bad"
>>means only that it isn't as optimal as picksplit may be for other keys. If there
>>are several keys that can be moved to the left or right page without changing
>>the union of the first key for each page, then GiST tries to put them on the
>>page (left or right) with the smallest penalty calculated from the other keys.
>>This algorithm is very similar to the one that decides which page receives a
>>tuple during normal processing (without a page split).
> 
> 
>>With a unique leading key, GiST's split is fully similar to BTree: it looks only
>>at the leading key. Gistchoose is different. Gistchoose (gistutil.c:622) chooses
>>the child with the smallest penalty, and it looks at the other keys if several
>>leading keys have the same penalty. In a GiST tree, different keys may have the
>>same penalty value for a new key.
> 
> 
> OK, how about this text then?
> 
>    A multicolumn GiST index can only be used when there is a query condition
>    on its leading column.  Conditions on additional columns restrict the
>    entries returned by the index, but the condition on the first column is the
>    most important one for determining how much of the index needs to be
>    scanned.  A GiST index will be relatively ineffective if its first column
>    has only a few distinct values, even if there are many distinct values in
>    additional columns.

Ok, I think.
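
To illustrate the tie-breaking described above, here is a minimal sketch of
choosing a child page by comparing penalties column by column. It is a toy
model, not the code in gistutil.c: the Child class, the interval and word-set
penalty functions, and the sample data are all assumptions made for
illustration, loosely modelled on a (date, tsvector) index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Child:
    """Hypothetical inner-page entry: the union of keys stored below it."""
    name: str
    lo: int            # first column: union is the closed interval [lo, hi]
    hi: int
    words: frozenset   # second column: union is a set of words


def interval_penalty(child, value):
    """How far [lo, hi] must grow to cover value (0 if already covered)."""
    return max(child.lo - value, 0) + max(value - child.hi, 0)


def set_penalty(child, words):
    """How many words must be added to the child's union."""
    return len(set(words) - child.words)


def choose_child(children, value, words):
    """Pick the child with the smallest penalty.

    Python compares tuples lexicographically, so the second-column
    penalty is consulted only when the leading-column penalties tie.
    """
    return min(
        children,
        key=lambda c: (interval_penalty(c, value), set_penalty(c, words)),
    )


children = [
    Child("A", 1, 10, frozenset({"index", "btree"})),
    Child("B", 5, 15, frozenset({"gist", "split"})),
    Child("C", 20, 30, frozenset({"gist"})),
]

# Value 7 fits in both A and B with zero leading-column penalty,
# so the second column decides: B needs one new word, A needs two.
print(choose_child(children, 7, {"gist", "penalty"}).name)   # prints "B"

# Value 25 fits only in C, so the second column is never decisive.
print(choose_child(children, 25, {"index"}).name)            # prints "C"
```

In this model, when the leading column has only a few distinct values, many
children tie on the first penalty and the choice falls to the other columns,
which matches the case the proposed documentation text warns about.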

-- 
Teodor Sigaev                                   E-mail: [email protected]
                                                    WWW: http://www.sigaev.ru/


