Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dpYxe-00019p-Ve for pgsql-performance@arkaria.postgresql.org; Wed, 06 Sep 2017 11:57:39 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dpYxe-0008Kr-IF for pgsql-performance@arkaria.postgresql.org; Wed, 06 Sep 2017 11:57:38 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dpYxW-0007lO-AF for pgsql-performance@postgresql.org; Wed, 06 Sep 2017 11:57:30 +0000 Received: from mail-oi0-x22b.google.com ([2607:f8b0:4003:c06::22b]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dpYxT-0006pz-CG for pgsql-performance@postgresql.org; Wed, 06 Sep 2017 11:57:29 +0000 Received: by mail-oi0-x22b.google.com with SMTP id x190so24509036oix.3 for ; Wed, 06 Sep 2017 04:57:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=starfishstorage-com.20150623.gappssmtp.com; s=20150623; h=mime-version:from:date:message-id:subject:to; bh=Qf/vrCKH2uFFoYp60UEoRUj9rcNgyJ9TswK2prn7QKg=; b=ZQ9xZuvG5q4OFyPPw8yOF2Y/SVdYaSPFTmQ+mQe9RwgJk3bRICBFH0meww1+x2QpNH 5Vz3t0alHK3zeVmT4f20UOBnt+JE9NyQStzGnmjHcTdFgd+NxQvIL/JgMhnogMXkHV/Y hC4RP1FOpo5Lw4c9ZlYESJGJ6d4wWo1+v/77Wx36ZUiiIXgTqe9admkfHUV1TX99Cnq9 PwEBMi89IwLLUl3c0eoz+7OGPaDO5k3lCtk4FVRzN+9jBu05JQ6zzvpXLvifq1lRWP1H uOru7+W1W3V0NpSAE12UkKrjZuaUeWdITQZXmXjBcMjuD2B3J+wi9Mm3zbtHXtHKZhIA Rqlw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=Qf/vrCKH2uFFoYp60UEoRUj9rcNgyJ9TswK2prn7QKg=; b=n3Ilr/p+vK5ETbGCNDbDwstUyCnS2VrPKnePO0ET3k4azO6f32rBeNJkZZG1O3N8iX phKRTjhsr5vkhHSzxnkOsN2UMlbdqItdNmLGf1hEP2pPQNxGrZHHQOax2NzE/I7v++/W Lq19oEYPHrtDiVIVgF/TVjbcwc6yuSvZ2+2PU0mR+u284uzEt58F4MiAAbO0W9Xx75Pw p+DxAChuAgIhbc7rK/v5jDqsJLffQDsDOcu+cpRnYNQUrq5ciLUqy/2RvImI3L1am8Um st3UdJBiOFUhxDUvatDor5yTPDxdUB/VmBH40W9+J+sG1pqueiaMmXTaJfGkGo/Agh6G IZ5Q== X-Gm-Message-State: AHPjjUhmONwF6awcfHwNA1fyvBBhWeYyZKlb/eLPJ5cMFp2MwU+wylGB UZNiL2V3SQklNRk2XfevdVDOAcfqNViIHdRlwQ== X-Google-Smtp-Source: ADKCNb4tlFS/2Gs2+9pKvk2JriwtCoc2uKdSML0yc1GNlN9dLjOwN+e9wZia2XMmCswFl20TDpo4YwxmzkSOstdqP6o= X-Received: by 10.202.206.12 with SMTP id e12mr2558030oig.144.1504699045587; Wed, 06 Sep 2017 04:57:25 -0700 (PDT) MIME-Version: 1.0 Received: by 10.74.66.69 with HTTP; Wed, 6 Sep 2017 04:57:25 -0700 (PDT) From: Marcin Barczynski Date: Wed, 6 Sep 2017 13:57:25 +0200 Message-ID: Subject: Slow vacuum of GIST indexes, because of random reads on PostgreSQL 9.6 To: pgsql-performance@postgresql.org Content-Type: multipart/alternative; boundary="001a113d24261b45d70558840c10" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a113d24261b45d70558840c10 Content-Type: text/plain; charset="UTF-8" I am using a GIST index on timestamp range, because it supports 'contains' operator ('@>'). Unfortunately, in large scale (billions of rows, index size: almost 800 GB) vacuuming the index takes an order of magnitude longer than btrees (days/weeks instead of hours). According to the code, during vacuum gist index is traversed in a logical order which translates into random disk acceses (function gistbulkdelete in gistvacuum.c). Btree indexes are vacuummed in physical order (function btvacuumscan in nbtree.c). As a workaround, I'm planning to replace all uses of 'contains' with the following function: CREATE OR REPLACE FUNCTION tstzrange_contains( range tstzrange, ts timestamptz) RETURNS bool AS $$ SELECT (ts >= lower(range) AND (lower_inc(range) OR ts > lower(range))) AND (ts <= upper(range) AND (upper_inc(range) OR ts < upper(range))) $$ LANGUAGE SQL IMMUTABLE; and create btree indexes on lower and upper bound: CREATE INDEX my_table_time_range_lower_idx ON my_table (lower(time_range)); CREATE INDEX my_table_time_range_upper_idx ON my_table (upper(time_range)); Is it the best approach? -- Best regards, Marcin Barczynski --001a113d24261b45d70558840c10 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
I am using a GIST index on timestamp range, because i= t supports 'contains' operator ('@>'). Unfortunately, in= large scale (billions of rows, index size: almost 800 GB) vacuuming the in= dex takes an order of magnitude longer than btrees (days/weeks instead of h= ours).=C2=A0
According to the code, during vacuum gist index is t= raversed in a logical order which translates into random disk acceses (func= tion gistbulkdelete in gistvacuum.c). Btree indexes are vacuummed in physic= al order (function btvacuumscan in nbtree.c).

As a= workaround, I'm planning to replace all uses of 'contains' wit= h the following function:

=C2=A0 =C2=A0 CREATE OR = REPLACE FUNCTION tstzrange_contains(
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = range tstzrange,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 ts timestamptz)
=C2=A0 =C2=A0 RETURNS bool AS
=C2=A0 =C2=A0 $$
= =C2=A0 =C2=A0 SELECT (ts >=3D lower(range) AND (lower_inc(range) OR ts &= gt; lower(range)))
=C2=A0 =C2=A0 =C2=A0 =C2=A0AND (ts <=3D upp= er(range) AND (upper_inc(range) OR ts < upper(range)))
=C2=A0 = =C2=A0 $$ LANGUAGE SQL IMMUTABLE;

and create btree= indexes on lower and upper bound:

=C2=A0 =C2=A0 C= REATE INDEX my_table_time_range_lower_idx ON my_table (lower(time_range));<= /div>
=C2=A0 =C2=A0 CREATE INDEX my_table_time_range_upper_idx ON my_ta= ble (upper(time_range));

Is it the best approach?<= /div>

--
=
Best regards,
Marcin Barczynski
<= div>
--001a113d24261b45d70558840c10--