Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tZswJ-00E0Oo-8k for pgsql-admin@arkaria.postgresql.org; Mon, 20 Jan 2025 14:39:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tZswH-002qAm-RJ for pgsql-admin@arkaria.postgresql.org; Mon, 20 Jan 2025 14:39:42 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tZswH-002q9F-AL for pgsql-admin@lists.postgresql.org; Mon, 20 Jan 2025 14:39:42 +0000 Received: from mail-oo1-xc30.google.com ([2607:f8b0:4864:20::c30]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tZswF-000ZAs-29 for pgsql-admin@lists.postgresql.org; Mon, 20 Jan 2025 14:39:40 +0000 Received: by mail-oo1-xc30.google.com with SMTP id 006d021491bc7-5f2b21a0784so1009088eaf.1 for ; Mon, 20 Jan 2025 06:39:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1737383979; x=1737988779; darn=lists.postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=K9QHkrVXmylB18IIjx6gxVBIqjFigOsKM0IOZGi7/vw=; b=cauAD4xtuC+Lii3tJGvL9y3AhCqUJzyCIWLE3yO29A9IhgOQHZET9ZCAdoYKi0bBdy RYX4G1MTEekbMsaiGh5Ujfa4ErdM49EJKTRG1gY8WyuPij0RhWwP3DlOzovy5j+3pGcC Log6CC5bK1w0UlbzCzm5Rirhasy0NjWaCeFcIyIPfcagadF1ybbXkxxAfFMBuHWT5CSs b/Yhy97eSU3rRcavnAGb6alouc0CUUX8jJnAt7STjlqK2pw0pw9xzJmHPbaRvWj0eM+5 x8jEPmrSOqx/KJtPQq8vlt/5AalZPYvg3qLIfY4vvk8A2t3f49hQusUIz179xV5Mj5GU gGlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1737383979; x=1737988779; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=K9QHkrVXmylB18IIjx6gxVBIqjFigOsKM0IOZGi7/vw=; b=T26FxADKGr2HkLx9nA0u76598Qv1p9A6thssvoTG1r7cUJBOEp4DU4OTCpZaieLaI9 xV/MEj10iTekyWS0PUelGLqCpi8WhcQMFgv1wgIU/yIXqxKDMkgvvpzKn4nWy1/Tj6ws YydqqThBMimdN1yISA3cPX3Jue20R3njsg0M0QsR1MBDyiAymQ+YEePfK+dl9jIu8U3F 5c7ZXYlrKKToE8tqtLWY6YnqDFovBQE4ig7YzzJotu6flC3sa6j7V6FIXns3QqbIPSie L3Mn6R0uTGKk0UouUTLi/HweHcjhaCXn/XYKmpe3vTcklwz986CVI7OIMf4gLI3zdC7C YL2Q== X-Gm-Message-State: AOJu0YygmYrokxhYhdDwuXyeywVOPP3qQ0vPgnUobfDBe1XBQQ7NUxMh pE6WKWsTQj3DSyIsVyRzVqPdzraYT5jGYfZuXjZaNGiCNEctZ/M9BRNij5AT+KMgl1xWIvH1YXj Hf3ytAKNHbrgFz4MsGgOoCRVh5+YN9HeW X-Gm-Gg: ASbGncvZxHxhnkAp5z34gpeixGmgcqxZGYy3O0ydtkXNKyqAR4ZEYPrv6mvjiqlrvvH MwzRPKVZoJdbJKlRyHoc7/M4G7rLyHrEs6SssyyIVtv44vbtp2DkewOO/lGIo5B4isO6X6HplX6 5H3P8/0ih0yw== X-Google-Smtp-Source: AGHT+IE0WWTu30KvIVtd3qvPUshvaBuX5HdR1tKlEWHJoEX37Fw80ZIW9xfDhVTzY2Y0t58a/IX83u2keTJSDZGd8MY= X-Received: by 2002:a05:6870:e18:b0:29e:5de2:cffb with SMTP id 586e51a60fabf-2b1c09271acmr7085893fac.17.1737383979291; Mon, 20 Jan 2025 06:39:39 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Ron Johnson Date: Mon, 20 Jan 2025 09:39:28 -0500 X-Gm-Features: AbW1kvYWlk-1RgeL5gdvlvMUWwcpIzQfJg_ravUCuZEa2dh0DAKX2AVfQTpU2NA Message-ID: Subject: Re: Performance issue - Seq Scan To: Pgsql-admin Content-Type: multipart/alternative; boundary="000000000000ec9a7f062c243a21" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000ec9a7f062c243a21 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable 500M rows doesn't have to be a lot of records. Are the tuples large? If I were to partition those tables, I would partition them on an existing PK field. Until then, I would: - disable AUTOVACUUM on those tables immediately before the ETL job starts - run the ETL job - "manually" run VACUUM ANALYZE on those tables. - enable AUTOVACUUM on those tables On Mon, Jan 20, 2025 at 6:07=E2=80=AFAM Siraj G wrote= : > Hello Guillaume! > > As I highlighted the records count for these tables which are quite high, > would it be a best practice if we change the vacuum and analyze scale > factor at the table level? > Also, I am trying to understand if partitioning is required for these > tables, or at least for the one which has over 500million records? > > Regards > Siraj > > On Mon, Jan 20, 2025 at 3:04=E2=80=AFPM Guillaume Lelarge > wrote: > >> Hi, >> >> Le lun. 20 janv. 2025 =C3=A0 09:42, Siraj G a =C3= =A9crit : >> >>> Hello Experts! >>> >>> We had a performance issue with a SQL that used to complete in a few >>> milliseconds, was taking over 14seconds. We had to run *analyze *on 3 >>> tables to get the idle performance back. >>> >>> When the performance was not optimal, we noticed sequential scans even >>> with indexes created. >>> >>> The tables and their count: >>> coverage_details =3D 529628595 >>> customer_details =3D 81721669 >>> policy_details =3D 116909729 >>> >>> PgSQL version is: >>> PostgreSQL 15.7 on x86_64-pc-linux-gnu, compiled by Debian clang versio= n >>> 12.0.1, 64-bit >>> >>> One more information is that we noticed this started happening (in the >>> destination) after an ETL job completed the load (regular load). *Just >>> wanted to know if any follow up actions we should do after such data lo= ads, >>> eg., analyze or vacuum. *We do have autovacuum on, with default values. >>> >>> >> Yes, you should run "VACUUM ANALYZE" after running a batch. autovacuum >> could be not fast enough to do it itself before you start querying the n= ew >> data. >> >> >> -- >> Guillaume. >> > --=20 Death to , and butter sauce. Don't boil me, I'm still alive. lobster! --000000000000ec9a7f062c243a21 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
500M rows doesn't have to be a lot of records.=C2= =A0 Are the tuples large?=C2=A0 If I were to partition those tables, I woul= d partition them on an existing PK field.

Until th= en, I would:
  • disable AUTOVACUUM on those=C2=A0tables imme= diately before the ETL job starts
  • run the ETL job
  • "man= ually" run VACUUM ANALYZE on those tables.
  • enable=C2=A0 AUTOVACUUM on those=C2=A0tables

On Mon, Jan = 20, 2025 at 6:07=E2=80=AFAM Siraj G <tosiraj.g@gmail.com> wrote:
Hello Guillaume!

As I highlighted the records count for these tables which are quite high, = would it be a best practice if we change the vacuum and analyze scale facto= r at the table level?
Also, I am trying to understand if partitio= ning is required for these tables, or at least for the one which has over 5= 00million records?

Regards
Siraj

On= Mon, Jan 20, 2025 at 3:04=E2=80=AFPM Guillaume Lelarge <guillaume@lelarge.info>= wrote:
Hi,

Le=C2=A0lun. 20 janv. 2025 =C3=A0=C2=A009:= 42, Siraj G <to= siraj.g@gmail.com> a =C3=A9crit=C2=A0:
Hello Experts!

<= /div>
We had a performance issue with a SQL that used to complete in a = few milliseconds, was taking over 14seconds. We had to run analyze o= n 3 tables to get the idle performance back.=C2=A0

When the performance was not optimal, we noticed sequential scans even wit= h indexes created.

The tables and their count:
coverage_details =3D 529628595
customer_details =3D 81721669
p= olicy_details =3D 116909729

PgSQL version is:
PostgreSQL 15.7 on x86_64-pc-linux-gnu, compiled by Debian clang vers= ion 12.0.1, 64-bit

One more information is that we= noticed this started happening (in the destination) after an ETL job compl= eted the load (regular load). Just wanted to know if any follow up actio= ns we should do after such data loads, eg., analyze or vacuum. We do ha= ve autovacuum on, with default values.


Yes, you should run "VACUUM ANALYZE" after= running a batch. autovacuum could be not fast enough to do it itself befor= e you start querying the new data.


--
Gu= illaume.


--
Death to <Redacted>, and butter sauce.Don't boil me, I'm still alive.
<Redacted> lobs= ter!
--000000000000ec9a7f062c243a21--