Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDdJf-00Cg6S-FW for pgsql-general@arkaria.postgresql.org; Tue, 28 Oct 2025 06:36:22 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vDdJc-00B3qp-Un for pgsql-general@arkaria.postgresql.org; Tue, 28 Oct 2025 06:36:19 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDdJc-00B3qh-GD for pgsql-general@lists.postgresql.org; Tue, 28 Oct 2025 06:36:19 +0000 Received: from mail-ed1-x530.google.com ([2a00:1450:4864:20::530]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vDdJY-004ePM-31 for pgsql-general@lists.postgresql.org; Tue, 28 Oct 2025 06:36:19 +0000 Received: by mail-ed1-x530.google.com with SMTP id 4fb4d7f45d1cf-63c12ff0c5eso11014118a12.0 for ; Mon, 27 Oct 2025 23:36:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761633375; x=1762238175; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=FtfJEajz/65qX6N9YFsQsKOUKs7gVZHCtFAN4onjC68=; b=Rve0Pmr4uP2kfA+lwUbx3jWbnd58VcXxlsdReLYNIYmGqc+VnT9EoChEF14MQGY52w EUxLe1Eg78dLyWZNG5vN2kJybO/TUZ5jODBww/Z3n61Qgx83H0u5BzH1MvbyLROSy39S W/5NVImmgplbYv4pQSGWrXU8oyJI8nNGXPHsdMGuTqVqR6Vrio2viLUSxOsd3BFLPc39 hMh4iyNPEM8XCVIYEmYsA5rLffVwX9xJmpTMipEX95iQ15Za8aOnkulvElRIsvu0JUoA gPM59oyhimpXbvPgP9/2LLgUYL11uFPo86zM9ZcoCMxfJOV1dYT75fRx2U6q8CUHnZLx mshg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761633375; x=1762238175; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=FtfJEajz/65qX6N9YFsQsKOUKs7gVZHCtFAN4onjC68=; b=fnaTfuZfmfO0xHJPv9gdke60FdM2R191zmqwBK+kK2w2z6T+eHxdVo3VJJigwDHxGQ ao+7VgymJ1ufKt9GhBt/CTvfdy7krxFsditEHxfDbdmSchDne9vEEPThv32wdj8WRsFO 296f3TSEzs8C8Z9jodq8I9UhnX0xppyrv1YkebEOIkw7VBQRBoh0b9htpQB9T9ADRY/2 Ps3MIAgw7uu5fkZ9ENt/D1tLHleoozrAAE1seEj5xehELnYiwOpIx4/OqDlOqHdriAlz QHpZxncOrqa/zDhl+tCCzoitlT/6tdMSXMyda6ijTI36xmoxhbORV2RwJXvRvEQeXJEf SXgA== X-Gm-Message-State: AOJu0YyWhAUey/kniHOtX1BrHB2JEr5qqIBhQbw8qaH5Dya8iZ1EQNtz KsJJ8pRaPieH05ruV8eRO92rO1c9tkHzYMr8mB1jEczXyJ6Ylby3NUL/B5ng+qidVLvImFXW1jw d9FQeNlWqCN/Wjn99VFBnnFV9eJ+CuZZMBpSh X-Gm-Gg: ASbGncvLEAJ1YRx+L6IxiHck9p19hu2xwSgOMMP5ksgOcFKgwW3ZGTo71E/pVs9AzF/ EE0v4vPzlx72GDyJOgqdgmD8RHn8xIh2vLZQR2CgPU9CgxzTDMhG0fnZ2I4yQEXDEg4TQABe+s+ XuKTNR9bJwHn8GkjTkapB4MmFx239l/4lZ8iw2w3tBSvB781SswvLIh9f2PPdJFQEHz5ScOmUkQ 464Yyy5xVXo+E7v0ysCN+ETp4K37gSZyhjzB7OmAScTWRW1RF/6/JjgTKqyKTmddw== X-Google-Smtp-Source: AGHT+IF9ZlSwg/tFlqtmYQ8FdXw+hpcFKe3SB14hgkwRdqvIxCAUDezMcrXQVo8zsujaUF4ZPCgy0cuoCC+o+ZMHXF8= X-Received: by 2002:a05:6402:13ca:b0:639:f648:1093 with SMTP id 4fb4d7f45d1cf-63f4bca7180mr1640973a12.4.1761633374865; Mon, 27 Oct 2025 23:36:14 -0700 (PDT) MIME-Version: 1.0 From: Giriraj Sharma Date: Tue, 28 Oct 2025 12:06:02 +0530 X-Gm-Features: AWmQ_bn7S5Kezytg6i-CJHiHofZ8WQdqnudwKxu0Fda_hS-F-vGQBo4s8polXfw Message-ID: Subject: Query planner instability on large append-only table with heavy same-day inserts To: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000888b650642323baf" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000888b650642323baf Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Environment PostgreSQL 14 / 15 tested. RDS / Aurora Postgres in production. Primary key index (effective_date, idempotency_id). Autovacuum enabled. SQLC (Go) for query generation. ________________________________ Context We are running into query planner instability on an append-only table that records daily commit information. The workload is heavily skewed toward the current day=E2=80=99s data, and q= uery performance deteriorates periodically until we run ANALYZE. ________________________________ Table Definition CREATE TABLE sample_table ( idempotency_id transactions.string NOT NULL, effective_date date NOT NULL, created_at timestamptz DEFAULT now() NOT NULL, CONSTRAINT commits_v3_pk PRIMARY KEY (effective_date, idempotency_id) ); ________________________________ Query Pattern All our lookups are parameterized (SQLC, Go): SELECT idempotency_id FROM sample_table WHERE effective_date =3D @effective_date AND idempotency_id IN ( SELECT unnest(@idempotency_ids::transactions.string[]) ); Typical characteristics: @effective_date almost always =3D current date. @idempotency_ids varies between 2=E2=80=9350 items off-peak and 800=E2=80= =931000 items at peak hours. Query is latency-sensitive (expected 5=E2=80=9310 ms). ________________________________ Workload Characteristics ~10=E2=80=9312 million inserts per weekday, almost entirely for the current= date. Practically append-only =E2=80=94 no updates or deletes. No weekend inserts. Occasional rare inserts for past or future dates (late or early trades). Retention: ~3 years of data (~1000 days =C3=97 10 M =3D 10+ billion rows). PostgreSQL 14+. ________________________________ Observed Behavior Immediately after ANALYZE, this query uses an Index Only Scan on the primary key (effective_date, idempotency_id) with stable latency around 5=E2=80=9310 ms. After several days (5=E2=80=937 typically), the query planner flips to a Se= quential Scan, and latency jumps to 2=E2=80=9330 seconds. Running ANALYZE transactions.commits_v3; restores performance instantly. We currently run manual ANALYZE twice a day via pg_cron, which helps but doesn=E2=80=99t fully guarantee stability (especially in test environments = where insert patterns are more random). ________________________________ What We=E2=80=99ve Tried Manual and scheduled ANALYZE runs (twice a day, each run takes ~30s) =E2=86= =92 improves performance but not sustainable long-term. Verified query execution plans before and after ANALYZE =E2=80=94 planner s= witches from Index Only Scan to Seq Scan as statistics become stale. Confirmed table=E2=80=99s autovacuum is running (last one occurred 15 days = ago), but its frequency isn=E2=80=99t sufficient to keep stats current during hig= h insert periods. Could this be an issue at all given that we do run ANALYZE at-least twice ? _______________________________ Problem Summary The planner=E2=80=99s row-count estimates for effective_date and idempotenc= y_id become inaccurate as we continuously append to =E2=80=9Ctoday=E2=80=99s=E2= =80=9D date. The result is plan instability (index scan =E2=86=94 sequential scan) until statistics are refreshed. We=E2=80=99re looking for a solution that keeps plans stable without manual= ANALYZE as data volume scales. ________________________________ Questions for the Community 1. Partitioning Would daily range partitioning by effective_date (=E2=89=88 1000 partitions= for 3 years) be the right long-term approach here? Given that inserts and queries almost always target today=E2=80=99s partiti= on, will partition pruning and per-partition statistics fully eliminate the stale-statistics problem? Are there known performance implications of maintaining ~1000 daily partitions at this scale (10 M/day)? We occasionally receive backdated or future-dated inserts =E2=80=94 can such out-of-range values be handled efficiently (e.g., by creating partitions on the fly)? 2. Autovacuum / Analyze Tuning If we stay with a single table, what are practical per-table autovacuum settings to ensure frequent ANALYZE even as total row count grows into billions? Would it make sense to use: ALTER TABLE transactions.commits_v3 SET ( autovacuum_analyze_scale_factor =3D 0, autovacuum_analyze_threshold =3D 50000, autovacuum_vacuum_scale_factor =3D 0.05, autovacuum_vacuum_threshold =3D 10000 ); to decouple analyze frequency from table size? Should we also experiment with raising the statistics target for effective_date from 100 to 1000 using: ALTER TABLE transactions.commits_v3 ALTER COLUMN effective_date SET STATISTICS 1000; to improve the planner=E2=80=99s histogram accuracy for the date distributi= on? 3. Best Practices Are there best practices or proven patterns for append-only, time-series=E2=80=93like workloads that insert heavily into one day and rea= d from the same day? Is there a known best way to make Postgres=E2=80=99s planner more resilient= to temporary statistic drift for parameterized queries like ours? ________________________________ Goal We=E2=80=99d like a =E2=80=9Cset it and forget it=E2=80=9D architecture =E2= =80=94 either through partitioning or robust autovacuum tuning =E2=80=94 where: The planner always chooses the index scan for same-day queries. We no longer need manual ANALYZE (that runs via pg_cron). Query latency remains in the 5=E2=80=9310 ms range, even as total data volu= me grows into billions of rows. ________________________________ Any recommendations, benchmark references, or production-proven strategies for this workload would be highly appreciated. --=20 Giriraj Sharma about.me/girirajsharma --000000000000888b650642323baf Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Environment

PostgreSQL 14 / 15 tested.
RDS / Aur= ora Postgres in production.
Primary key index (effective_date, idempoten= cy_id).
Autovacuum enabled.
SQLC (Go) for query generation.
______= __________________________
Context

We are running into query plan= ner instability on an append-only table that records daily commit informati= on.
The workload is heavily skewed toward the current day=E2=80=99s data= , and query performance deteriorates periodically until we run ANALYZE.
=
________________________________
Table Definition

CREATE TABLE sample_table (
=C2=A0 =C2=A0 idempotency_id = transactions.string NOT NULL,
=C2=A0 =C2=A0 effective_date date NOT NULL= ,
=C2=A0 =C2=A0 created_at timestamptz DEFAULT now() NOT NULL,
=C2=A0= =C2=A0 CONSTRAINT commits_v3_pk PRIMARY KEY (effective_date, idempotency_i= d)
);

________________________________
Query Pattern
All our lookups are parameterized (SQLC, Go):

SELECT idempotency_id
FROM sample_table
WHERE effective_date = =3D @effective_date
=C2=A0 AND idempotency_id IN (
=C2=A0 =C2=A0 =C2= =A0 SELECT unnest(@idempotency_ids::transactions.string[])
=C2=A0 );

Typical characteristics:

@effective_date almost always = =3D current date.
@idempotency_ids varies between 2=E2=80=9350 items off= -peak and 800=E2=80=931000 items at peak hours.
Query is latency-sensiti= ve (expected 5=E2=80=9310 ms).

________________________________
W= orkload Characteristics

~10=E2=80=9312 million inserts per weekday, = almost entirely for the current date.

Practically append-only =E2=80= =94 no updates or deletes.
No weekend inserts.
Occasional rare insert= s for past or future dates (late or early trades).
Retention: ~3 years o= f data (~1000 days =C3=97 10 M =3D 10+ billion rows).
PostgreSQL 14+.
________________________________
Observed Behavior

Immediate= ly after ANALYZE, this query uses an Index Only Scan on the primary key (ef= fective_date, idempotency_id) with stable latency around 5=E2=80=9310 ms.
After several days (5=E2=80=937 typically), the query planner flips t= o a Sequential Scan, and latency jumps to 2=E2=80=9330 seconds.

Runn= ing ANALYZE transactions.commits_v3; restores performance instantly.
We currently run manual ANALYZE twice a day via pg_cron, which helps but d= oesn=E2=80=99t fully guarantee stability (especially in test environments w= here insert patterns are more random).

_____________________________= ___
What We=E2=80=99ve Tried

Manual and scheduled ANALYZE runs (t= wice a day, each run takes ~30s) =E2=86=92 improves performance but not sus= tainable long-term.
Verified query execution plans before and after ANAL= YZE =E2=80=94 planner switches from Index Only Scan to Seq Scan as statisti= cs become stale.
Confirmed table=E2=80=99s autovacuum is running (last o= ne occurred 15 days ago), but its frequency isn=E2=80=99t sufficient to kee= p stats current during high insert periods. Could this be an issue at all g= iven that we do run ANALYZE at-least twice ?

_______________________= ________
Problem Summary

The planner=E2=80=99s row-count estimate= s for effective_date and idempotency_id become inaccurate as we continuousl= y append to =E2=80=9Ctoday=E2=80=99s=E2=80=9D date.
The result is plan i= nstability (index scan =E2=86=94 sequential scan) until statistics are refr= eshed.
We=E2=80=99re looking for a solution that keeps plans stable with= out manual ANALYZE as data volume scales.

__________________________= ______

Questions for the Community

1. Partitioning

Wou= ld daily range partitioning by effective_date (=E2=89=88 1000 partitions fo= r 3 years) be the right long-term approach here?

Given that inserts = and queries almost always target today=E2=80=99s partition, will partition = pruning and per-partition statistics fully eliminate the stale-statistics p= roblem? Are there known performance implications of maintaining ~1000 daily= partitions at this scale (10 M/day)? We occasionally receive backdated or = future-dated inserts =E2=80=94 can such out-of-range values be handled effi= ciently (e.g., by creating partitions on the fly)?

2. Autovacuum / A= nalyze Tuning

If we stay with a single table, what are practical per= -table autovacuum settings to ensure frequent ANALYZE even as total row cou= nt grows into billions?

Would it make sense to use:

ALTER TABLE transactions.commits_v3
=C2=A0 SET (
=C2= =A0 =C2=A0 autovacuum_analyze_scale_factor =3D 0,
=C2=A0 =C2=A0 autovacu= um_analyze_threshold =3D 50000,
=C2=A0 =C2=A0 autovacuum_vacuum_scale_fa= ctor =3D 0.05,
=C2=A0 =C2=A0 autovacuum_vacuum_threshold =3D 10000
= =C2=A0 );


to decouple analyze frequency from table size?
<= br>Should we also experiment with raising the statistics target for effecti= ve_date from 100 to 1000 using:

ALTER TABLE= transactions.commits_v3
=C2=A0 ALTER COLUMN effective_date SET STATISTI= CS 1000;

to improve the planner=E2=80=99s histogram accuracy = for the date distribution?

3. Best Practices

Are there best p= ractices or proven patterns for append-only, time-series=E2=80=93like workl= oads that insert heavily into one day and read from the same day?

Is= there a known best way to make Postgres=E2=80=99s planner more resilient t= o temporary statistic drift for parameterized queries like ours?

___= _____________________________

Goal

We=E2=80=99d like a =E2=80= =9Cset it and forget it=E2=80=9D architecture =E2=80=94 either through part= itioning or robust autovacuum tuning =E2=80=94 where:
The planner always= chooses the index scan for same-day queries.
We no longer need manual A= NALYZE (that runs via pg_cron).
Query latency remains in the 5=E2=80=931= 0 ms range, even as total data volume grows into billions of rows.

_= _______________________________

Any recommendations, benchmark refer= ences, or production-proven strategies for this workload would be highly ap= preciated.


--
=C2=A0
Giriraj Sharma
about.me/girirajsharma
=C2=A0
=C2=A0 --000000000000888b650642323baf--