Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wKXEX-0016Ak-0j for pgsql-bugs@arkaria.postgresql.org; Wed, 06 May 2026 08:03:53 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wKXEU-00FcaV-2E for pgsql-bugs@arkaria.postgresql.org; Wed, 06 May 2026 08:03:50 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wKXEU-00FcaN-15 for pgsql-bugs@lists.postgresql.org; Wed, 06 May 2026 08:03:50 +0000 Received: from mail-lf1-x136.google.com ([2a00:1450:4864:20::136]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wKXER-00000000mpe-3m8X for pgsql-bugs@lists.postgresql.org; Wed, 06 May 2026 08:03:50 +0000 Received: by mail-lf1-x136.google.com with SMTP id 2adb3069b0e04-5a87edf88b3so1842416e87.0 for ; Wed, 06 May 2026 01:03:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1778054625; cv=none; d=google.com; s=arc-20240605; b=LH9H0j6sDAeHkrGJNXcLTmFWv2cC4heBUZvAHy8m5nzgIxuTITpiloEQW7rL5xnX/g ktXHOUdTUi9GQk4wNUiGcbw9LWxwZJ9SyO4GFEPqASslA6gkcY2izLqLTTpLQqlM00I7 yj1rMYdQ4d+RcG+cMTTNu4gMcv6jFVyTS71B55riuCYqTIphS3rMHA2XHlx2ry5FTn/3 3TRjSp8NhycnEXeZqw9zAUqZjs875XjCl2EkpVM+XbNtHNUwzUsJ8Omu87Li5fwkfOsB BCyz01N/Rbbrlr55rXrRNa0gIDHf2IE6xBeWZfGih1onbUMMRqmuKmP8U4JDvXl5i/lF 8/4Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=/79wcmhx0IeHn/1gMau+HvVEkRk7ul6kRQF4CTd2NVU=; fh=A59NFpreBY0YrsdxJfHw/vbr7O8dsQRT2aXVib0su8s=; b=kBVVa24KNgzoSLieatjKjNYjg3f0sSAtxjMyC7lIhHgpIRYDoC1swIuZVEHXka2Wlq 2oeX87AqIgEh2KRLYdFkbgxm2pNw+O2ZbKKKhMVM58H6n48/wcK5cYrE8j8E5epaJIGv +UsCMWkuYAZcLprC6jEZBWCQ5RFu9ICZ1s470/g+B1hQkU5ITstOT1HcNqfvKVIrLmZT hyXtW73eXloBNzxY3/naw3hssI0DxkUNWPvUMGwIYOb0RuiSIafIURLagjVFfozHwTaV GNPpdqJ4wKjAICjfsqoZhfoTmhfFEH5hQtfMd+qY0dX05ciUKtSZYicAj4ZPI9Ey9tI4 jebA==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1778054625; x=1778659425; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=/79wcmhx0IeHn/1gMau+HvVEkRk7ul6kRQF4CTd2NVU=; b=QK1e47WW0zNfsyG8RDgxBq2LBV2CMEt5iiUccti6ii2UbIgOMaliLRXq038FLXiSA7 9gNDaJo0F7PuZLw2A93HaXw8Jd/nyMG/OA3EZodi3PwdFZ0fXhdN71DRYehQI9H+Ur6i SXN/XW/W2MeP71ECCIH/N5MPEww9HSXcEHrgPW5rH5NQecQtYEgNGcBJCsNMv2XecLw1 c6kaBk6FyKrzqRmdurGlEfp/5lHPLlL3ZpkRxuCYrKiq8fLRmXQXGrn5yp8xWnckmE+W zfJ2ZgPcDj5iGXCiK4Ih7B/5XfBg6kA7tAIGeW3837AzCKJLVMV14EuLP6bUE+kP6yub rsyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1778054625; x=1778659425; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=/79wcmhx0IeHn/1gMau+HvVEkRk7ul6kRQF4CTd2NVU=; b=jLVgl0Ha61+sY/cTneHUoIoQXJmbf+tQ/0W4dMORurqMO957mFLKJFyIfGJHcbbD6E +7ta4kKG79N9acpPnZAiEFXUnbnxaKKVoEkiDBkDsZdnL7jGKaUAdo5ng+qdk/d43/HM zMdEnxGCm/Rz/YGhGtoS3qZRWzlx5nc84xlBbHXwPHww/nkXFX3sU6SKx+tgaOMhaTvS U387yokj/kY7PPYkzyqZdYrJgHDhI/ObXxXdJc8sIepu2LwL3ICN78h02yne2z4gik6p Z+MsQawmtVAITs84LUa7eebmuIwGkQIa8M4QfDAiRd+Bi+gdtN2K+TivYlzLzSRopJ7g c4kg== X-Gm-Message-State: AOJu0YykeNQe46ZfAuqqO+CYOtVQLGdYFIkKS1I/ntM/Bvf+G1qqWhra KaYY7mzzPp5zIscedy2pTwkhiY/ps88WCjsdRSHrp+R0M/fzlVe0dfTT39UEdip6WrK5wUZjVTn p/9rA9vBVxeti3utxkiF+g06I+7SyefU= X-Gm-Gg: AeBDievQpFoYzUiDq+fp1p3s65119M9SreRHAu5zoeyhCA6Tfhs3izrzmitrjGfqdHp WOnoLDmGA/qdInZw6CCVMeiuKzbAGX9bGc2Hd1LhstlGSGSpQCs+rBIoAguELCqutHbhMN5gQrq q0mqqOR6EQAXqOIM5f8++bFTIQyGkFfQWlY7oiDi9usB9m5vnFkREA+ZODH96hM3SClM3CfHMaP O7t0nWqKZ/xx9MZp5aqGtQ+Ezc3/KpasJsG0iApzRT4RwGmfnNxlx+4xLl4W/FdK52vRNwrTQqs K/ziyRejjO17kV3wYdm+BFLTAIVl99rPb6zuoCFFctnuTcD3PjGNdGJUGmKMolIq/gAJ76+n9SU aOGDtrrGmYKVArtnArsX+NR45xBW3LmQ2nqD9Dw== X-Received: by 2002:a05:6512:3188:b0:5a3:f152:df65 with SMTP id 2adb3069b0e04-5a887ce5d06mr892748e87.20.1778054624859; Wed, 06 May 2026 01:03:44 -0700 (PDT) MIME-Version: 1.0 References: <19439-8f02018f9fc3a240@postgresql.org> In-Reply-To: From: klemen kobau Date: Wed, 6 May 2026 10:03:08 +0200 X-Gm-Features: AVHnY4JMVayet5Opa9mLdeoT48C8tSc55sbMyfdz8GhbYT3EC2dkS2fKe13k9eQ Message-ID: Subject: Re: BUG #19439: pg_stat_xact_user_tables stat not currect during the transaction To: Xuneng Zhou Cc: pgsql-bugs@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000004e5cb20651219a86" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000004e5cb20651219a86 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Xuneng, Thanks for the analysis and the patch proposals. As the original reporter, I wanted to follow up since the thread has been quiet for over a month. I am new to this process, do I need to do anything to help the process? Kind regards Klemen Kobau On Wed, 6 May 2026 at 09:56, Xuneng Zhou wrote: > On Sun, Mar 29, 2026 at 11:17=E2=80=AFAM Xuneng Zhou wrote: > > > > On Sat, Mar 28, 2026 at 1:47=E2=80=AFPM Xuneng Zhou > wrote: > > > > > > On Fri, Mar 27, 2026 at 10:15=E2=80=AFAM Xuneng Zhou > wrote: > > > > > > > > Hi klemen, > > > > > > > > Thanks for the report. > > > > > > > > On Fri, Mar 27, 2026 at 5:36=E2=80=AFAM PG Bug reporting form > > > > wrote: > > > > > > > > > > The following bug has been logged on the website: > > > > > > > > > > Bug reference: 19439 > > > > > Logged by: klemen kobau > > > > > Email address: klemen.kobau@gmail.com > > > > > PostgreSQL version: 18.0 > > > > > Operating system: Linux (EndeavorOS) > > > > > Description: > > > > > > > > > > I am running postgres:18.0 in a docker container, the > configuration is as > > > > > follows: > > > > > > > > > > postgres: > > > > > image: postgres:18.0 > > > > > command: [ > > > > > "postgres", > > > > > "-N", "200", > > > > > "-c", "max_prepared_transactions=3D100", > > > > > "-c", "wal_level=3Dlogical", > > > > > "-c", "max_wal_senders=3D10", > > > > > "-c", "max_replication_slots=3D20", > > > > > "-c", "wal_keep_size=3D10", > > > > > "-c", "max_slot_wal_keep_size=3D1024" > > > > > ] > > > > > environment: > > > > > POSTGRES_USER: postgres > > > > > POSTGRES_PASSWORD: postgres > > > > > POSTGRES_DB: postgres > > > > > TZ: UTC > > > > > PGTZ: UTC > > > > > ports: > > > > > - 5432:5432 > > > > > volumes: > > > > > - postgres_data:/var/lib/postgresql > > > > > > > > > > I use psql version 18.3. > > > > > > > > > > I run the following: > > > > > > > > > > psql -h localhost -p 5432 -U postgres -d postgres -c " > > > > > -- Transaction 1: insert 1 row, check stats, commit > > > > > BEGIN; > > > > > CREATE TABLE IF NOT EXISTS temp.xact_test (id serial PRIMARY KEY, > val text); > > > > > INSERT INTO temp.xact_test (val) VALUES ('a'); > > > > > SELECT relname, n_tup_ins FROM pg_stat_xact_user_tables WHERE > relname =3D > > > > > 'xact_test'; > > > > > COMMIT; > > > > > > > > > > -- Transaction 2: insert 1 row, check stats > > > > > BEGIN; > > > > > INSERT INTO temp.xact_test (val) VALUES ('b'); > > > > > SELECT relname, n_tup_ins FROM pg_stat_xact_user_tables WHERE > relname =3D > > > > > 'xact_test'; > > > > > ROLLBACK; > > > > > > > > > > -- Cleanup > > > > > DROP TABLE temp.xact_test; > > > > > " > > > > > > > > > > And the output is > > > > > > > > > > BEGIN > > > > > CREATE TABLE > > > > > INSERT 0 1 > > > > > relname | n_tup_ins > > > > > -----------+----------- > > > > > xact_test | 1 > > > > > (1 row) > > > > > > > > > > COMMIT > > > > > BEGIN > > > > > INSERT 0 1 > > > > > relname | n_tup_ins > > > > > -----------+----------- > > > > > xact_test | 2 > > > > > (1 row) > > > > > > > > > > ROLLBACK > > > > > DROP TABLE > > > > > > > > I can also reproduce this behavior on HEAD. > > > > > > > > > This is not what I would expect from reading > > > > > > https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-= STATS-VIEWS > , > > > > > where it states > > > > > > > > > > > A transaction can also see its own statistics (not yet flushed > out to the > > > > > shared memory statistics) in the views pg_stat_xact_all_tables, > > > > > pg_stat_xact_sys_tables, pg_stat_xact_user_tables, and > > > > > pg_stat_xact_user_functions. These numbers do not act as stated > above; > > > > > instead they update continuously throughout the transaction. > > > > > > > > > > based on this, I would expect that the numbers are updated each > time an > > > > > insert happens and that they are isolated per transaction. > > > > > > > > > > > > > This snippet of the doc feels somewhat ambiguous. The current > behavior > > > > does not seem fully consistent with wording such as =E2=80=9Cthroug= hout the > > > > transaction.=E2=80=9D It also seems more appropriate for these tabl= e views to > > > > reflect stats within their own scopes, rather than cumulative value= s > > > > spanning txn boundaries. > > > > > > > > > > I=E2=80=99ve looked into this issue and 'd like to propose a patch to= address > > > it. Feedback is very welcome. > > > > > > --- Root cause > > > > > > Since the stats subsystem was rewritten (commit 5891c7a8), each > > > backend keeps a hash of pending stats entries that persist across > > > transaction boundaries. Entries that flush successfully are deleted > > > from the backend-local pending list; entries not flushed remain > > > pending with their accumulated counters. Flushing is driven by > > > pgstat_report_stat(), called from the ReadyForQuery path, and subject > > > to a rate limiter. > > > > > > The pg_stat_xact_* views read these pending entries directly via > > > find_tabstat_entry() / find_funcstat_entry(). Both functions return > > > the raw accumulated counters without per-transaction scoping. > > > > > > The bug is deterministic when multiple top-level transactions are > > > processed within a single simple-query message, because there is no > > > ReadyForQuery boundary between the transactions and therefore no > > > opportunity to flush and remove the pending entry. > > > > > > --- Fix > > > > > > The patch introduces a per-entry "transaction baseline" =E2=80=94 a l= azy > > > snapshot of the counters taken the first time an entry is touched in > > > each new top-level transaction. The accessor functions > > > (find_tabstat_entry, find_funcstat_entry) then subtract the baseline > > > from the current counters, yielding only the current transaction's > > > delta. > > > > > > The baseline is keyed by MyProc->vxid.lxid. A static inline helper, > > > pgstat_ensure_xact_baseline(), is called at every nontransactional > > > counter-increment site (the 7 event-counter macros in pgstat.h and th= e > > > 4 non-inline counter functions in pgstat_relation.c). After the first > > > call per entry per transaction, the check reduces to a single integer > > > comparison with a predictably not-taken branch. > > > > > > For function stats, a new PgStat_FunctionPending wrapper struct embed= s > > > PgStat_FunctionCounts at offset 0 with the baseline fields appended, > > > so the flush callback requires only a trivial cast change. > > > > > > --- Testing > > > > > > The regression test is a TAP test under src/bin/psql/t/ that uses psq= l > > > -c to send multi-statement strings as single simple-query messages. > > > src/bin/psql/t look like the right existing harness for the primary > > > regression because psql -c sends the whole multi-statement string as > > > one simple-query message via simple_action_list dispatch. By contrast= , > > > ordinary pg_regress SQL files are executed by psql in file/stdin mode= , > > > which runs through MainLoop() and dispatches top-level statements one > > > at a time via SendQuery(), creating ReadyForQuery boundaries that > > > could mask the bug. > > > > > > The test covers three scenarios: table n_tup_ins and seq_scan > > > counters, and function calls =E2=80=94 each verifying isolation acros= s > > > consecutive top-level transactions within a single message. > > > > > > > postgres % cat output.txt > > BEGIN > > CREATE TABLE > > INSERT 0 1 > > relname | n_tup_ins > > -----------+----------- > > xact_test | 1 > > (1 row) > > > > COMMIT > > BEGIN > > INSERT 0 1 > > relname | n_tup_ins > > -----------+----------- > > xact_test | 2 > > (1 row) > > > > ROLLBACK > > DROP TABLE > > > > I can also reproduce this behavior in v13. So this issue pre-existed > > the major refactor 5891c7a8. > > > > After further thought of v1, I'd like to also propose an alternative > approach. The root cause and testing rationale are the same as > described upthread; what changes is how the baseline is established. > > --- Problems with the lazy (v1) approach > The v1 patch snapshots the baseline lazily -- on every > nontransactional counter increment, it checks whether baseline_lxid > matches the current > transaction and snapshots if not. This has two drawbacks: > > 1) Hot-path overhead > Every pgstat_count_heap_scan(), pgstat_count_buffer_read_block(), etc. > acquires a branch. After the first call per entry per transaction, the > branch is predictably not-taken, but it could still present on some of > the most frequently executed paths in the backend. > > 2) Maintenance burden > Every new counter added to PgStat_TableCounts requires a corresponding > baseline snapshot at its increment site. Miss one, and that counter > silently leaks across transactions > > --- eager baseline sweep > The attached patch records the baseline eagerly at transaction > boundaries instead of lazily at counter-increment sites. > pgstat_set_pending_baselines() iterates the pgStatPending list and > snapshots each entry's current counts into an xact_baseline field via > struct assignment. It is called from AtEOXact_PgStat() (after folding > transactional counts and removing dropped entries) and from > PostPrepare_PgStat() (after relation cleanup), covering commit, abort, > and PREPARE TRANSACTION. The view accessors unconditionally subtract > the baseline. For entries created in the current transaction, > xact_baseline is zero-initialized, so the subtraction is a no-op. > > I don=E2=80=99t have a clear preference between the two approaches; both = are > presented for review. > > -- > Best, > Xuneng > --0000000000004e5cb20651219a86 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Xuneng,

Thanks for the analysis and the pat= ch proposals.=20 As the original reporter, I wanted to follow up since the thread has=20 been quiet for over a month.

I am new to this proc= ess, do I need to do anything to help the=C2=A0process?

Kind=C2=A0regards

Klemen Kobau


On Wed, 6 May 2026 at 09:56, Xuneng Zhou <xunengzhou@gmail.com> wrote:
On Sun, Mar 29, 2026 at 11:= 17=E2=80=AFAM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> On Sat, Mar 28, 2026 at 1:47=E2=80=AFPM Xuneng Zhou <xunengzhou@gmail.com> wr= ote:
> >
> > On Fri, Mar 27, 2026 at 10:15=E2=80=AFAM Xuneng Zhou <xunengzhou@gmail.com= > wrote:
> > >
> > > Hi klemen,
> > >
> > > Thanks for the report.
> > >
> > > On Fri, Mar 27, 2026 at 5:36=E2=80=AFAM PG Bug reporting for= m
> > > <noreply@postgresql.org> wrote:
> > > >
> > > > The following bug has been logged on the website:
> > > >
> > > > Bug reference:=C2=A0 =C2=A0 =C2=A0 19439
> > > > Logged by:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 klemen kob= au
> > > > Email address:=C2=A0 =C2=A0 =C2=A0 klemen.kobau@gmail.com
> > > > PostgreSQL version: 18.0
> > > > Operating system:=C2=A0 =C2=A0Linux (EndeavorOS)
> > > > Description:
> > > >
> > > > I am running postgres:18.0 in a docker container, the c= onfiguration is as
> > > > follows:
> > > >
> > > >=C2=A0 =C2=A0postgres:
> > > >=C2=A0 =C2=A0 =C2=A0image: postgres:18.0
> > > >=C2=A0 =C2=A0 =C2=A0command: [
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"postgres",
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"-N", "200&quo= t;,
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"-c", "max_pre= pared_transactions=3D100",
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"-c", "wal_lev= el=3Dlogical",
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"-c", "max_wal= _senders=3D10",
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"-c", "max_rep= lication_slots=3D20",
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"-c", "wal_kee= p_size=3D10",
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"-c", "max_slo= t_wal_keep_size=3D1024"
> > > >=C2=A0 =C2=A0 =C2=A0]
> > > >=C2=A0 =C2=A0 =C2=A0environment:
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0POSTGRES_USER: postgres
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0POSTGRES_PASSWORD: postgres > > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0POSTGRES_DB: postgres
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0TZ: UTC
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0PGTZ: UTC
> > > >=C2=A0 =C2=A0 =C2=A0ports:
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0- 5432:5432
> > > >=C2=A0 =C2=A0 =C2=A0volumes:
> > > >=C2=A0 =C2=A0 =C2=A0 =C2=A0- postgres_data:/var/lib/post= gresql
> > > >
> > > > I use psql version 18.3.
> > > >
> > > > I run the following:
> > > >
> > > > psql -h localhost -p 5432 -U postgres -d postgres -c &q= uot;
> > > > -- Transaction 1: insert 1 row, check stats, commit
> > > > BEGIN;
> > > > CREATE TABLE IF NOT EXISTS temp.xact_test (id serial PR= IMARY KEY, val text);
> > > > INSERT INTO temp.xact_test (val) VALUES ('a');<= br> > > > > SELECT relname, n_tup_ins FROM pg_stat_xact_user_tables= WHERE relname =3D
> > > > 'xact_test';
> > > > COMMIT;
> > > >
> > > > -- Transaction 2: insert 1 row, check stats
> > > > BEGIN;
> > > > INSERT INTO temp.xact_test (val) VALUES ('b');<= br> > > > > SELECT relname, n_tup_ins FROM pg_stat_xact_user_tables= WHERE relname =3D
> > > > 'xact_test';
> > > > ROLLBACK;
> > > >
> > > > -- Cleanup
> > > > DROP TABLE temp.xact_test;
> > > > "
> > > >
> > > > And the output is
> > > >
> > > > BEGIN
> > > > CREATE TABLE
> > > > INSERT 0 1
> > > >=C2=A0 =C2=A0relname=C2=A0 | n_tup_ins
> > > > -----------+-----------
> > > >=C2=A0 xact_test |=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01 > > > > (1 row)
> > > >
> > > > COMMIT
> > > > BEGIN
> > > > INSERT 0 1
> > > >=C2=A0 =C2=A0relname=C2=A0 | n_tup_ins
> > > > -----------+-----------
> > > >=C2=A0 xact_test |=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02 > > > > (1 row)
> > > >
> > > > ROLLBACK
> > > > DROP TABLE
> > >
> > > I can also reproduce this behavior on HEAD.
> > >
> > > > This is not what I would expect from reading
> > > > https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORIN= G-STATS-VIEWS,
> > > > where it states
> > > >
> > > > > A transaction can also see its own statistics (not= yet flushed out to the
> > > > shared memory statistics) in the views pg_stat_xact_all= _tables,
> > > > pg_stat_xact_sys_tables, pg_stat_xact_user_tables, and<= br> > > > > pg_stat_xact_user_functions. These numbers do not act a= s stated above;
> > > > instead they update continuously throughout the transac= tion.
> > > >
> > > > based on this, I would expect that the numbers are upda= ted each time an
> > > > insert happens and that they are isolated per transacti= on.
> > > >
> > >
> > > This snippet of the doc feels somewhat ambiguous. The curren= t behavior
> > > does not seem fully consistent with wording such as =E2=80= =9Cthroughout the
> > > transaction.=E2=80=9D It also seems more appropriate for the= se table views to
> > > reflect stats within their own scopes, rather than cumulativ= e values
> > > spanning txn boundaries.
> > >
> >
> > I=E2=80=99ve looked into this issue and 'd like to propose a = patch to address
> > it. Feedback is very welcome.
> >
> > --- Root cause
> >
> > Since the stats subsystem was rewritten (commit 5891c7a8), each > > backend keeps a hash of pending stats entries that persist across=
> > transaction boundaries. Entries that flush successfully are delet= ed
> > from the backend-local pending list; entries not flushed remain > > pending with their accumulated counters. Flushing is driven by > > pgstat_report_stat(), called from the ReadyForQuery path, and sub= ject
> > to a rate limiter.
> >
> > The pg_stat_xact_* views read these pending entries directly via<= br> > > find_tabstat_entry() / find_funcstat_entry(). Both functions retu= rn
> > the raw accumulated counters without per-transaction scoping.
> >
> > The bug is deterministic when multiple top-level transactions are=
> > processed within a single simple-query message, because there is = no
> > ReadyForQuery boundary between the transactions and therefore no<= br> > > opportunity to flush and remove the pending entry.
> >
> > --- Fix
> >
> > The patch introduces a per-entry "transaction baseline"= =E2=80=94 a lazy
> > snapshot of the counters taken the first time an entry is touched= in
> > each new top-level transaction. The accessor functions
> > (find_tabstat_entry, find_funcstat_entry) then subtract the basel= ine
> > from the current counters, yielding only the current transaction&= #39;s
> > delta.
> >
> > The baseline is keyed by MyProc->vxid.lxid. A static inline he= lper,
> > pgstat_ensure_xact_baseline(), is called at every nontransactiona= l
> > counter-increment site (the 7 event-counter macros in pgstat.h an= d the
> > 4 non-inline counter functions in pgstat_relation.c). After the f= irst
> > call per entry per transaction, the check reduces to a single int= eger
> > comparison with a predictably not-taken branch.
> >
> > For function stats, a new PgStat_FunctionPending wrapper struct e= mbeds
> > PgStat_FunctionCounts at offset 0 with the baseline fields append= ed,
> > so the flush callback requires only a trivial cast change.
> >
> > --- Testing
> >
> > The regression test is a TAP test under src/bin/psql/t/ that uses= psql
> > -c to send multi-statement strings as single simple-query message= s.
> > src/bin/psql/t look like the right existing harness for the prima= ry
> > regression because psql -c sends the whole multi-statement string= as
> > one simple-query message via simple_action_list dispatch. By cont= rast,
> > ordinary pg_regress SQL files are executed by psql in file/stdin = mode,
> > which runs through MainLoop() and dispatches top-level statements= one
> > at a time via SendQuery(), creating ReadyForQuery boundaries that=
> > could mask the bug.
> >
> > The test covers three scenarios: table n_tup_ins and seq_scan
> > counters, and function calls =E2=80=94 each verifying isolation a= cross
> > consecutive top-level transactions within a single message.
> >
>
>=C2=A0 postgres % cat output.txt
> BEGIN
> CREATE TABLE
> INSERT 0 1
>=C2=A0 =C2=A0relname=C2=A0 | n_tup_ins
> -----------+-----------
>=C2=A0 xact_test |=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01
> (1 row)
>
> COMMIT
> BEGIN
> INSERT 0 1
>=C2=A0 =C2=A0relname=C2=A0 | n_tup_ins
> -----------+-----------
>=C2=A0 xact_test |=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02
> (1 row)
>
> ROLLBACK
> DROP TABLE
>
> I can also reproduce this behavior in v13. So this issue pre-existed > the major refactor 5891c7a8.
>

After further thought of v1, I'd like to also propose an alternative approach. The root cause and testing rationale are the same as
described upthread; what changes is how the baseline is established.

--- Problems with the lazy (v1) approach
The v1 patch snapshots the baseline lazily -- on every
nontransactional counter increment, it checks whether baseline_lxid
matches the current
transaction and snapshots if not. This has two drawbacks:

1) Hot-path overhead
Every pgstat_count_heap_scan(), pgstat_count_buffer_read_block(), etc.
acquires a branch. After the first call per entry per transaction, the
branch is predictably not-taken, but it could still present on some of
the most frequently executed paths in the backend.

2) Maintenance burden
Every new counter added to PgStat_TableCounts requires a corresponding
baseline snapshot at its increment site. Miss one, and that counter
silently leaks across transactions

--- eager baseline sweep
The attached patch records the baseline eagerly at transaction
boundaries instead of lazily at counter-increment sites.
pgstat_set_pending_baselines() iterates the pgStatPending list and
snapshots each entry's current counts into an xact_baseline field via struct assignment. It is called from AtEOXact_PgStat() (after folding
transactional counts and removing dropped entries) and from
PostPrepare_PgStat() (after relation cleanup), covering commit, abort,
and PREPARE TRANSACTION. The view accessors unconditionally subtract
the baseline. For entries created in the current transaction,
xact_baseline is zero-initialized, so the subtraction is a no-op.

I don=E2=80=99t have a clear preference between the two approaches; both ar= e
presented for review.

--
Best,
Xuneng
--0000000000004e5cb20651219a86--