Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDlip-00FVxo-PU for pgsql-general@arkaria.postgresql.org; Tue, 28 Oct 2025 15:34:55 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vDlio-00DnQj-Gf for pgsql-general@arkaria.postgresql.org; Tue, 28 Oct 2025 15:34:53 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDlfY-00Dif7-Ko for pgsql-general@lists.postgresql.org; Tue, 28 Oct 2025 15:31:31 +0000 Received: from mail-il1-x12a.google.com ([2607:f8b0:4864:20::12a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vDlfV-004ikF-0L for pgsql-general@lists.postgresql.org; Tue, 28 Oct 2025 15:31:31 +0000 Received: by mail-il1-x12a.google.com with SMTP id e9e14a558f8ab-430c17e29d9so25660715ab.1 for ; Tue, 28 Oct 2025 08:31:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761665487; x=1762270287; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=nLr9Ym/g5alc4l3CB8LHv0NHaBmFDKs3wFDh9Q3Kg/E=; b=R3lH0oDjG8ZoLUirEVg5gRrGnTaetvf1+j50b793aDts7gqp8JaTIkEHTEFyIhRHS5 XynMjoA/mTjB4zmlg/Q08/436NocnolXe8W9N51cygkU66xaHq3IIO0v4dwAPkqY7MQ7 an8IKy7BEjm900DrtOATj65Fih27/FjtDzgf2On0cgHI9Cy+zUNMpURv0Sn2c/RCBI/O TQN1iMFx52FwsOG69bNxpptv6rziGt8rrUdc+514OTH+H+j7knJBYP2inMfnYDk0UxzH znIEonUTYDQPB/lXnvs5j7KUSdwkNyGbHaZhO7VL+D1V1Gzd/kLZbfiIkj8rk8y4k+Sk zgjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761665487; x=1762270287; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=nLr9Ym/g5alc4l3CB8LHv0NHaBmFDKs3wFDh9Q3Kg/E=; b=TICeiJn2gbII9PLo5qlH/iJQ71l7r8oW+EZnof7Lf671WvdfFsWIjKjR2qV91jSYBw XMgFnPT14ITHLDA0RbMkgLs5On6g2i5a2KuQsSYua0WQ4qhOBFzrUe2BFWptK+O5wSm2 zStvDZE+CUHXm6QAQpz4wy+u8gDsx0PEUDGcZfWetaQ7umv7OqQ+64QgcnhDCV+9fQEK Su53J0hGEPOSM8kyzQQrNOAiqN2nA0UIOfgI+6v96TIhu6cBNpQ5Kp74eehKf5vQMta4 EDvvKNZmoFuAlG3Q8/9vcHnZXk5rFRQP/yqnpH8G5ghFudU3/68Qc5bMwT8hPcogfLIV rJtA== X-Gm-Message-State: AOJu0YxQoJGT7XjO3C2UOXBPxCHwS+i/eKucdxrTz3/cQykpZC/OqdCn LWdBR6X04m4S1KaufdmQfda1UD5ydZi1Yl3gcm1v47ZIxb8+gypJja9ZeFlP0D3ZhngUGnblINM 63+r65eReJPPJpXvEmdB9yxWsGPZjYOLBXsKlu8I= X-Gm-Gg: ASbGncvv91+oy0sq1Z7tGpxZ3x5RWUN+JGSvE6UyAloLeWHUiM/yX9SJrCjln6BJKgR L8jLN0obzjVDCysoptyUwOjQk+GyWRjO1HpJNL0D08BEUtSLYklK5sZG7y0XSQ97RaA+0V0FZlb TCuQY44KP8XPaH0yikSz8SP6AtnZn+slkBTj19DgAQBc2NJ38tddEzoHuT4FGd4GKj32C4b4XKX J0of9Kh/TXHrmDnwnHBJnM3a+dHHHU1JdhZMiJUjr4S1U37NwYXPjkLNsQ9qNpv+LkTgGt6MY7+ IK+JcmkeHAos5YxLpg== X-Google-Smtp-Source: AGHT+IHLE4RhtZAEqGi1sYx2yi+YnOsy2AllCKytPzu2LZ16BwQGtf7OSWPydBcXy4jMAGTvPys70PyqCWbdtd0uht8= X-Received: by 2002:a05:6e02:3188:b0:430:a38c:b767 with SMTP id e9e14a558f8ab-432f82ef7bfmr1855285ab.18.1761665486381; Tue, 28 Oct 2025 08:31:26 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Greg Sabino Mullane Date: Tue, 28 Oct 2025 11:30:50 -0400 X-Gm-Features: AWmQ_bkIue8OmvOde42DPO62SZ7_KTZcfS3gMMmSW6I6WPzNMD5OiQvXM9HOGNc Message-ID: Subject: Re: Query planner instability on large append-only table with heavy same-day inserts To: Giriraj Sharma Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000008764e5064239b5dd" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000008764e5064239b5dd Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Oct 28, 2025 at 2:36=E2=80=AFAM Giriraj Sharma wrote: > Would daily range partitioning by effective_date (=E2=89=88 1000 partitio= ns for 3 > years) be the right long-term approach here? > If you are querying based on dates, yes. Does not need to be daily, could do monthly for example. A retention policy of 3 years is a great candidate for partitioning. > Given that inserts and queries almost always target today=E2=80=99s parti= tion, > will partition pruning and per-partition statistics fully eliminate the > stale-statistics problem? Are there known performance implications of > maintaining ~1000 daily partitions at this scale (10 M/day)? We > occasionally receive backdated or future-dated inserts =E2=80=94 can such > out-of-range values be handled efficiently (e.g., by creating partitions = on > the fly)? > It will help, don't know about eliminate. I would not do daily unless it is really needed, that's a lot of partitions. Monthly to start with. You cannot create partitions on the fly, but you can have them go to a default partition and sort them out later. Or pre-create a bunch of partitions. If we stay with a single table, what are practical per-table autovacuum > settings to ensure frequent ANALYZE even as total row count grows into > billions? > Turn off autovacuum for that table. Use cron to run vacuum and analyze hourly (or some frequency that ensures good plans). If analyze is taking too long, it can be done per-column as well, although that won't help much if your sample table is representative. But if you have a lot of other columns with stable values, you could analyze those less often. Measure and see. Should we also experiment with raising the statistics target for > effective_date from 100 to 1000 using: > I don't think that will matter if your analyze is already giving you index-only scans. Are there best practices or proven patterns for append-only, > time-series=E2=80=93like workloads that insert heavily into one day and r= ead from > the same day? > Partitioning. Ensure fillfactor is 100%. Minimal indexes. Is there a known best way to make Postgres=E2=80=99s planner more resilient= to > temporary statistic drift for parameterized queries like ours? Not really, other than giving Postgres updated stats via frequent analyze. PostgreSQL 14 / 15 tested. Moving to a newer version is always helpful too. Cheers, Greg -- Crunchy Data - https://www.crunchydata.com Enterprise Postgres Software Products & Tech Support --0000000000008764e5064239b5dd Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Tue, Oct 28, 2025 at 2:36=E2=80=AFAM G= iriraj Sharma <giriraj.sha= rma27@gmail.com> wrote:
Would daily range partitioning by effective_date (=E2=89=88 1000 partitio= ns for 3 years) be the right long-term approach here?

If you are querying based on dates, yes. Does not need= to be daily, could do monthly for example. A retention policy of 3 years i= s a great candidate for partitioning.
=C2=A0
Given that inserts and = queries almost always target today=E2=80=99s partition, will partition prun= ing and per-partition statistics fully eliminate the stale-statistics probl= em? Are there known performance implications of maintaining ~1000 daily par= titions at this scale (10 M/day)? We occasionally receive backdated or futu= re-dated inserts =E2=80=94 can such out-of-range values be handled efficien= tly (e.g., by creating partitions on the fly)?
<= br>
It will help,=C2=A0don't know about eliminate. I would no= t do daily=C2=A0unless it is really needed, that's a lot of partitions.= Monthly to start with. You cannot create partitions on the fly, but you ca= n have them go to a default partition and sort them out later. Or pre-creat= e a bunch of partitions.

If we stay with a single table, what a= re practical per-table autovacuum settings to ensure frequent ANALYZE even = as total row count grows into billions?

Turn off autovacuum for that table. Use cron to run vacuum and analy= ze hourly (or some frequency that ensures=C2=A0good plans). If analyze is t= aking too long, it can be done per-column as well, although that won't = help much if your sample table is representative. But if you have a lot of = other columns with stable values, you could analyze those less often. Measu= re and see.

Should we also experiment with raising the statisti= cs target for effective_date from 100 to 1000 using:
=

I don't think that will matter if your analyze is a= lready giving you index-only scans.

Are there best practices = or proven patterns for append-only, time-series=E2=80=93like workloads that= insert heavily into one day and read from the same day?

Partitioning. Ensure fillfactor is 100%. Minimal in= dexes.

PostgreSQL 14 / 15 tested.

=
Moving to a newer version is always helpful too.=C2=A0

<= /div>

Cheers,
Greg

--
Crunchy Data - = https://www.crunchydata.com
Enterprise Postgres Software Prod= ucts & Tech Support

--0000000000008764e5064239b5dd--