MIME-Version: 1.0
From: Marcin Barczynski <mbarczynski@starfishstorage.com>
Date: Wed, 6 Sep 2017 13:57:25 +0200
Message-ID: 
 <CAOhG4wem20oFCyZhGW-WyqJcno45z2trPnvsDrDUcPbOytFycg@mail.gmail.com>
Subject: Slow vacuum of GIST indexes,
 because of random reads on PostgreSQL 9.6
To: pgsql-performance@postgresql.org
Content-Type: multipart/alternative; boundary="001a113d24261b45d70558840c10"
Precedence: bulk
Sender: pgsql-performance-owner@postgresql.org

--001a113d24261b45d70558840c10
Content-Type: text/plain; charset="UTF-8"

I am using a GiST index on a timestamp range column because it supports the
'contains' operator ('@>'). Unfortunately, at large scale (billions of rows,
index size almost 800 GB), vacuuming the index takes an order of magnitude
longer than vacuuming btree indexes (days or weeks instead of hours).
According to the code, vacuum traverses a GiST index in logical order,
which translates into random disk accesses (function gistbulkdelete in
gistvacuum.c). Btree indexes are vacuumed in physical order (function
btvacuumscan in nbtree.c).
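
To put numbers on the difference, a minimal sketch along these lines could
be used. It assumes the table is called my_table (as in the index
definitions in this message) and that looking the indexes up through
pg_stat_user_indexes is good enough. VACUUM (VERBOSE) reports each index
scan separately, so the time spent in the GiST index can be compared with
the btree indexes on the same table.

```sql
-- List every index on my_table with its on-disk size, largest first.
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'my_table'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Vacuum the table and print per-index progress and timing details.
VACUUM (VERBOSE) my_table;
```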

As a workaround, I'm planning to replace all uses of 'contains' with the
following function:

    CREATE OR REPLACE FUNCTION tstzrange_contains(
        range tstzrange,
        ts timestamptz)
    RETURNS bool AS
    $$
    SELECT (ts >= lower(range) AND (lower_inc(range) OR ts > lower(range)))
       AND (ts <= upper(range) AND (upper_inc(range) OR ts < upper(range)))
    $$ LANGUAGE SQL IMMUTABLE;

and create btree indexes on lower and upper bound:

    CREATE INDEX my_table_time_range_lower_idx ON my_table (lower(time_range));
    CREATE INDEX my_table_time_range_upper_idx ON my_table (upper(time_range));
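
A sketch of how this could be checked, under a few assumptions: the
timestamps are arbitrary sample values, and the explicit WHERE clause in
the second statement assumes every stored range uses '[)' bounds, which may
not hold for the real data. The first statement compares the function with
the built-in operator on boundary cases. lower() and upper() return NULL
for an unbounded (or empty) range, so the function yields NULL where '@>'
yields true. The second statement spells the bounds out directly, which
avoids depending on the planner inlining the SQL function before it can
match the expression indexes. EXPLAIN shows whether the indexes are
actually chosen.

```sql
-- Compare the built-in operator with the replacement function.
SELECT r, ts,
       r @> ts                   AS builtin_contains,
       tstzrange_contains(r, ts) AS function_contains
FROM (VALUES
    -- ts equals the inclusive lower bound
    (tstzrange('2017-01-01 00:00:00+00', '2017-02-01 00:00:00+00', '[)'),
     timestamptz '2017-01-01 00:00:00+00'),
    -- ts equals the exclusive upper bound
    (tstzrange('2017-01-01 00:00:00+00', '2017-02-01 00:00:00+00', '[)'),
     timestamptz '2017-02-01 00:00:00+00'),
    -- unbounded upper: upper(r) is NULL, so the function returns NULL
    (tstzrange('2017-01-01 00:00:00+00', NULL, '[)'),
     timestamptz '2017-03-01 00:00:00+00')
) AS t(r, ts);

-- Explicit form of the predicate, assuming '[)' bounds on every row.
EXPLAIN
SELECT *
FROM my_table
WHERE lower(time_range) <= timestamptz '2017-09-06 12:00:00+00'
  AND upper(time_range) >  timestamptz '2017-09-06 12:00:00+00';
```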

Is this the best approach?

-- 
Best regards,
Marcin Barczynski

--001a113d24261b45d70558840c10--