public inbox for [email protected]
help / color / mirror / Atom feedFrom: veem v <[email protected]>
To: Lok P <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: How to create efficient index in this scenario?
Date: Sun, 9 Jun 2024 10:21:55 +0530
Message-ID: <CAB+=1TUghHyWXDhEqeWhzWRgWJPy44pr7VkhX+v_-nCph6GWgA@mail.gmail.com> (raw)
In-Reply-To: <CAKna9VajLFW=9Z1Y9ar0WJXKeGTgYXivFtBmdt=gXJoLs4s2Rw@mail.gmail.com>
References: <CAB+=1TX+Av1Fx+Q4YOmUGioUoa8TQ8kGa1h06zPSEona2az39A@mail.gmail.com>
<CAKna9VajLFW=9Z1Y9ar0WJXKeGTgYXivFtBmdt=gXJoLs4s2Rw@mail.gmail.com>
On Sun, 9 Jun 2024 at 09:45, Lok P <[email protected]> wrote:
>
> On Sat, Jun 8, 2024 at 7:03 PM veem v <[email protected]> wrote:
>
>>
>> There is a blog below (which is for oracle), showing how the index should
>> be chosen and it states , "*Stick the columns you do range scans on
>> last in the index, filters that get equality predicates should come first.*
>> ", and in that case we should have the PK created as in the order
>> (transaction_id,transaction_timestamp). It's because making the range
>> predicate as a leading column won't help use that as an access predicate
>> but as an filter predicate thus will read more blocks and thus more IO.
>> Does this hold true in postgres too?
>>
>>
>> https://ctandrewsayer.wordpress.com/2017/03/24/the-golden-rule-of-indexing/
>>
>
> I believe the analogy holds true here in postgres too and the index in
> this case should be on (transaction_id, transaction_timestamp).
>
>
>>
>>
>> Additionally there is another scenario in which we have the requirement
>> to have another timestamp column (say create_timestamp) to be added as part
>> of the primary key along with transaction_id and we are going to query this
>> table frequently by the column create_timestamp as a range predicate. And
>> ofcourse we will also have the range predicate filter on partition key
>> "transaction_timestamp". But we may or may not have join/filter on column
>> transaction_id, so in this scenario we should go for
>> (create_timestamp,transaction_id,transaction_timestamp). because
>> "transaction_timestamp" is set as partition key , so putting it last
>> doesn't harm us. Will this be the correct order or any other index order is
>> appropriate?
>>
>>
>>
> In this case , the index should be on (
> create_timestamp,transaction_id,transaction_timestamp), considering the
> fact that you will always have queries with "create_timestamp" as predicate
> and may not have transaction_id in the query predicate.
>
So in the second scenario, if we keep the create_timestamp as the leading
column ,is it not against the advice which the blog provides i.e. to not
have the range predicate as the leading column in the index?
view thread (2+ messages)
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: How to create efficient index in this scenario?
In-Reply-To: <CAB+=1TUghHyWXDhEqeWhzWRgWJPy44pr7VkhX+v_-nCph6GWgA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox