From: Rafael Thofehrn Castro <[email protected]>
To: [email protected]
Subject: Inconsistent increment of pg_stat_database.xact_rollback with logical replication
Date: Fri, 14 Jun 2024 19:15:06 -0300
Message-ID: <CAG0ozMo_xWQn+Avv8jzbbhePGp5OnhdO+YWTkdg4faWSXz0Jzg@mail.gmail.com> (raw)

The xact_rollback column of pg_stat_database is incremented inconsistently
on the publisher side when logical replication is in use.

This can be easily reproduced with the latest code from the master branch:

- Publisher

postgres=# select xact_commit, xact_rollback from pg_stat_database where datname = 'postgres';
-[ RECORD 1 ]-+---
xact_commit   | 20
xact_rollback | 0

postgres=# insert into t1 values (1);
INSERT 0 1
postgres=# insert into t1 values (2);
INSERT 0 1
postgres=# insert into t1 values (3);
INSERT 0 1
postgres=# insert into t1 values (4);
INSERT 0 1
postgres=# insert into t1 values (5);
INSERT 0 1
postgres=# insert into t1 values (6);
INSERT 0 1
postgres=# insert into t1 values (7);
INSERT 0 1
postgres=# insert into t1 values (8);
INSERT 0 1
postgres=# insert into t1 values (9);
INSERT 0 1
postgres=# insert into t1 values (10);
INSERT 0 1

postgres=# select xact_commit, xact_rollback from pg_stat_database where datname = 'postgres';
-[ RECORD 1 ]-+---
xact_commit   | 33
xact_rollback | 0

- Subscriber

postgres=# alter subscription sub disable;
ALTER SUBSCRIPTION

- Publisher

postgres=# select xact_commit, xact_rollback from pg_stat_database where datname = 'postgres';
-[ RECORD 1 ]-+---
xact_commit   | 36
xact_rollback | 10

What seems to be happening is that the number of transactions decoded by
the walsender is being added to pg_stat_database.xact_rollback, but these
counts are only flushed to the global stats when the walsender terminates.
In the example above, the 10 inserts show up as 10 rollbacks only after the
subscription is disabled.
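
To make the suspected behavior concrete, here is a toy Python model. It is
not PostgreSQL code: the class and method names are hypothetical, and it
only assumes what is described above, namely that each decoded transaction
is counted as a rollback in the walsender's local stats and that those
counts reach the shared stats only when the walsender exits.

```python
from dataclasses import dataclass


@dataclass
class DatabaseStats:
    """Toy stand-in for the shared pg_stat_database counters."""
    xact_commit: int = 0
    xact_rollback: int = 0


@dataclass
class ToyWalsender:
    """Hypothetical model: counts rollbacks locally, flushes only on exit."""
    shared: DatabaseStats
    pending_rollback: int = 0

    def decode_transaction(self) -> None:
        # Assumption: every decoded transaction ends in an abort used for
        # cleanup, which is counted as a rollback in the local pending stats.
        self.pending_rollback += 1

    def terminate(self) -> None:
        # Assumption: pending counts reach the shared stats only here.
        self.shared.xact_rollback += self.pending_rollback
        self.pending_rollback = 0


# Counter values taken from the second psql query in the reproduction.
shared = DatabaseStats(xact_commit=33, xact_rollback=0)
walsender = ToyWalsender(shared)

for _ in range(10):  # the 10 inserts decoded by the walsender
    walsender.decode_transaction()

before = shared.xact_rollback  # nothing visible while the walsender runs
walsender.terminate()          # e.g. after ALTER SUBSCRIPTION ... DISABLE
after = shared.xact_rollback

print(before, after)
```

Under these assumptions the shared counter stays at 0 while the walsender
is alive and jumps by 10 at termination, which matches the numbers in the
reproduction.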

From a quick look at the source, I suspect that the issue starts
here:
https://github.com/postgres/postgres/blob/master/src/backend/replication/logical/reorderbuffer.c#L25...

All decoded transactions are aborted for cleanup purposes. Following the
source code flow after calling AbortCurrentTransaction() we eventually
reach the part that increments rollback stats here:
https://github.com/postgres/postgres/blob/master/src/backend/utils/activity/pgstat_database.c#L249

This makes the TPS metric we monitor for a database inconsistent: we
occasionally see sudden TPS spikes on the order of millions.
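
As a rough illustration of the effect on monitoring, the sketch below
computes TPS from successive counter samples. It assumes a monitoring tool
that derives TPS as the delta of xact_commit + xact_rollback over a fixed
sampling interval; the interval value and the helper names are made up for
the example, and only the counter values come from the psql output above.

```python
# Counter samples (xact_commit, xact_rollback) from the psql output above.
samples = [(20, 0), (33, 0), (36, 10)]

# Assumption: fixed sampling interval; the real timing is not given.
INTERVAL_S = 10.0


def deltas(prev, curr):
    """Return (commit delta, rollback delta) between two samples."""
    return (curr[0] - prev[0], curr[1] - prev[1])


def tps(prev, curr, interval_s=INTERVAL_S, include_rollbacks=True):
    """Transactions per second between two samples."""
    commits, rollbacks = deltas(prev, curr)
    total = commits + (rollbacks if include_rollbacks else 0)
    return total / interval_s


intervals = list(zip(samples, samples[1:]))
interval_deltas = [deltas(a, b) for a, b in intervals]

for a, b in intervals:
    # Print the raw deltas, then TPS with and without rollbacks.
    print(deltas(a, b), tps(a, b), tps(a, b, include_rollbacks=False))
```

In the last interval, 10 of the 13 counted transactions are rollbacks that
appeared only when the walsender exited. With a long-lived walsender the
accumulated count would be far larger, which would be consistent with the
spikes we observe.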

Regards,

Rafael Castro.

