Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vWJjc-00GxH3-0P for pgsql-general@arkaria.postgresql.org; Thu, 18 Dec 2025 19:32:25 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vWJjZ-003lBp-2G for pgsql-general@arkaria.postgresql.org; Thu, 18 Dec 2025 19:32:22 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vWJjZ-003lBg-0Z for pgsql-general@lists.postgresql.org; Thu, 18 Dec 2025 19:32:22 +0000 Received: from mail-oi1-x234.google.com ([2607:f8b0:4864:20::234]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vWJjX-001NjK-3D for pgsql-general@postgresql.org; Thu, 18 Dec 2025 19:32:20 +0000 Received: by mail-oi1-x234.google.com with SMTP id 5614622812f47-44fe6771b2dso276506b6e.1 for ; Thu, 18 Dec 2025 11:32:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766086339; x=1766691139; darn=postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=1HumFMxhpQufef/BP90IqWZwhfpRiHgJ2yrdg241g0I=; b=FS6JZEq/d5++kq+Bg/XQNiwOU8krbNCEquZNYKysea3Oeh7NRgJK4pHrZ1KPkwI0s3 pqR0eOodQmYld+T4YvV6UTqMpBaGEwRrEK99Lesqf7e/z+JvYIzqETf4UXj0+wo1tU70 845ICVLFfwMHciWBNZB8RTrhgV7zudoqSRYSjVmPnDqBL+iwq+OXJdO2pgIPIAfnj1zy 7NJvZk8Jk3eyzY9jsevKwjM55ICYyxDDxg5IrpgOxKL2oyRALJMk5v/GlRUD5ZPapxle P+XAZsUdS1gbEifbGuduqkQ4hJX9DWh8Er+VMtU/5yArtR7mLoSPvEN2O1ReYvSkjl92 85zA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766086339; x=1766691139; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-gg:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=1HumFMxhpQufef/BP90IqWZwhfpRiHgJ2yrdg241g0I=; b=PpBlOWaeeQ+JhAS6ao5kOQzin8+nJSw/momoS65XWr3bmD4MarJvxOd1nx6EGA2L+F lJV3ZKuU1VQT2Hs+N3qLVNeuWLxb6gT/wyDxpc7j2TeCBw1meQjyPoDfbh7+Z/DGLYNA ptRAX+z7ADoQxH7CxWoiYZC7z3hOgV7tuCfjE5AHFB14MtXxiv7EpSXc70m+fovx3oRN /+qdxIGxOtAtmak+/puwT3jkPFDEDGRlvuBalzXq/8UwPzhtORknARgUd2tZRPgUK35C bRZZGGfYhLKA0N0M16L/O14Aa/CU84dfk1BTXzHzBBT/baa8u/ZdKKbDkas1+WiFh0/c HzPQ== X-Gm-Message-State: AOJu0YxK6HEhQcpOj5nfdbJRIEhD16ASfmMQjyZk1dhGkkrSVZ2lHn7D 2sh58aZ46KqA0ndoqZo4vXDcRzl4ILe3IXXGOF/RyHypdAYtDF+lhQMGprIx0hngK3WVa9a6xWl xDpseVKiweXzP7UUZyKYo/ORl3Upn73r+cEVu X-Gm-Gg: AY/fxX5LHevE5KQFSCocR4a4HmNGTahq77uiuZ69UDborRGUcaBPPd58SaLtax0OiPB EY+Kz6DFRi7H5iteEFPuxuP7UQqhSbIsAqkT9A8TJYWo8Wo9MbR2SECVijh+Sf6cSIz9ffcG7r1 xVlxFVOvsWr9zbTNepUDeLiiotVzxy4zWzp0kcg8l23Aq1eXam6mJCKCOBtnfMvqhipADvSnsrw Y8myCHt4vk2Z8f2yFhoCjNRjalgGsEhnMapYDyBt4mB136NuKWCvO2pxm6omeGzc+Br7KIQ X-Google-Smtp-Source: AGHT+IHoo1NDDzYSsIoi8zmq7JLR3bu4yRmte/q7ob/XHDZqzRVI8eRJ+O7c9Sd5VU+2MABzqUJXmqHfQHvt52+RtmU= X-Received: by 2002:a05:6808:199f:b0:450:760b:cc9e with SMTP id 5614622812f47-457b20c2726mr362943b6e.27.1766086338592; Thu, 18 Dec 2025 11:32:18 -0800 (PST) MIME-Version: 1.0 References: <87bjjv1v96.fsf@gmail.com> In-Reply-To: <87bjjv1v96.fsf@gmail.com> From: Ron Johnson Date: Thu, 18 Dec 2025 14:32:07 -0500 X-Gm-Features: AQt7F2p7lQ6C3R0uN_l-ZW4WhWg5ToZlAx35MTsp19_I80VEZvDbwnCpsHLeeyg Message-ID: Subject: Re: Dealing with SeqScans when Time-based Partitions Cut Over To: pgsql-general Content-Type: multipart/alternative; boundary="000000000000dacbc606463f0438" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000dacbc606463f0438 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Dec 18, 2025 at 1:48=E2=80=AFPM Matthew Planchard wrote: > > In a table with high insert frequency (~1.5k rows/s) and high query > frequency (~1k queries/s), partitioned by record creation time, we have > observed the following behavior: > > * When the current time crosses a partition boundary, all new records > are written to the new partition, which was previously empty, as > expected > > * Because the planner's latest knowledge of the partition was based on > its state prior to the cutover, it assumes the partition is empty and > creates plans that use sequential scans > > * The table accumulates tens to hundreds of thousands of rows, and the > sequentail scans start to use nearly 100% of available database CPU > > * Eventually the planner updates thee stats and all is well, but the > cycle repeats the next time the partitions cut over. > > We have tried setting up a cron job that runs ANALYZE on the most recent > partition of the table every 15 seconds at the start of the hour, and > while this does help in reducing the magnitude and duration of the > problem, it is insufficient to fully resolve it (our engineers are still > getting daily pages for high DB CPU utilization). > What's autovacuum_analyze_scale_factor set to? The default 20% is pretty high. autovacuum_naptime might need to be dropped, too. And maybe have the shell script that the cron job runs sleep only 5 seconds in the ANALY loop. > We have considered maintaining a separate connection pool with > connections that have `enable_seqscan` set to `off`, and updating the > application to use that pool for these queries, but I was hoping the > community might have some better suggestions. > How about just force seqscan off when the table is created? ALTER TABLE SET (enable_seqscan =3D off); --=20 Death to , and butter sauce. Don't boil me, I'm still alive. lobster! --000000000000dacbc606463f0438 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Thu, Dec 18, 2025 at 1:48=E2=80=AFPM M= atthew Planchard <msplanchard@g= mail.com> wrote:

In a table with high insert frequency (~1.5k rows/s) and high query
frequency (~1k queries/s), partitioned by record creation time, we have
observed the following behavior:

* When the current time crosses a partition boundary, all new records
=C2=A0 are written to the new partition, which was previously empty, as
=C2=A0 expected

* Because the planner's latest knowledge of the partition was based on<= br> =C2=A0 its state prior to the cutover, it assumes the partition is empty an= d
=C2=A0 creates plans that use sequential scans

* The table accumulates tens to hundreds of thousands of rows, and the
=C2=A0 sequentail scans start to use nearly 100% of available database CPU<= br>
* Eventually the planner updates thee stats and all is well, but the
=C2=A0 cycle repeats the next time the partitions cut over.

We have tried setting up a cron job that runs ANALYZE on the most recent partition of the table every 15 seconds at the start of the hour, and
while this does help in reducing the magnitude and duration of the
problem, it is insufficient to fully resolve it (our engineers are still getting daily pages for high DB CPU utilization).

What's=C2=A0autovacuum_analyz= e_scale_factor set to?=C2=A0 =C2=A0The default 20% is pretty high.
autovacuum_naptime might need to be dropped, too.

And maybe have the shell script that the cron job runs sleep only 5 secon= ds in the ANALY loop.
=C2=A0
We have considered maintaining a separate connection poo= l with
connections that have `enable_seqscan` set to `off`, and updating the
application to use that pool for these queries, but I was hoping the
community might have some better suggestions.

How about just force seqscan=C2=A0off when the table is created?
ALTER TABLE <table_partition> SET (enable_seqscan=C2=A0 =3D of= f);

--
Death t= o <Redacted>, and butter sauce.
Don't boil me, I'm still = alive.
<Redacted> lobster!
=
--000000000000dacbc606463f0438--