public inbox for [email protected]  
help / color / mirror / Atom feed
From: Marcin Barczynski <[email protected]>
To: [email protected]
Subject: Slow vacuum of GIST indexes, because of random reads on PostgreSQL 9.6
Date: Wed, 6 Sep 2017 13:57:25 +0200
Message-ID: <CAOhG4wem20oFCyZhGW-WyqJcno45z2trPnvsDrDUcPbOytFycg@mail.gmail.com> (raw)
List-Unsubscribe:  <mailto:[email protected]?body=unsub%20pgsql-performance>

I am using a GIST index on timestamp range, because it supports 'contains'
operator ('@>'). Unfortunately, in large scale (billions of rows, index
size: almost 800 GB) vacuuming the index takes an order of magnitude longer
than btrees (days/weeks instead of hours).
According to the code, during vacuum gist index is traversed in a logical
order which translates into random disk acceses (function gistbulkdelete in
gistvacuum.c). Btree indexes are vacuummed in physical order (function
btvacuumscan in nbtree.c).

As a workaround, I'm planning to replace all uses of 'contains' with the
following function:

    CREATE OR REPLACE FUNCTION tstzrange_contains(
        range tstzrange,
        ts timestamptz)
    RETURNS bool AS
    $$
    SELECT (ts >= lower(range) AND (lower_inc(range) OR ts > lower(range)))
       AND (ts <= upper(range) AND (upper_inc(range) OR ts < upper(range)))
    $$ LANGUAGE SQL IMMUTABLE;

and create btree indexes on lower and upper bound:

    CREATE INDEX my_table_time_range_lower_idx ON my_table
(lower(time_range));
    CREATE INDEX my_table_time_range_upper_idx ON my_table
(upper(time_range));

Is it the best approach?

-- 
Best regards,
Marcin Barczynski


reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected]
  Subject: Re: Slow vacuum of GIST indexes, because of random reads on PostgreSQL 9.6
  In-Reply-To: <CAOhG4wem20oFCyZhGW-WyqJcno45z2trPnvsDrDUcPbOytFycg@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox