From: Dennis White <[email protected]>
To: Pgsql-admin <[email protected]>
Subject: Logical replication stopped suddenly claiming wal_status lost when max_slot_wal_keep_size was unlimited
Date: Fri, 23 Aug 2024 17:29:07 -0400
Message-ID: <CAE=rie84d==WrWfLMxD-_8Gmavjh33rrRficpK_q0qg3==cC_Q@mail.gmail.com>

After running continuously for perhaps a year or more, my project's logical
replication stopped on our test DB this morning. The server claims the WAL
was lost due to a size limit, even though no limit is configured
(max_slot_wal_keep_size is -1).

The system is running CentOS 7, and I was planning on moving to RHEL 8 and
PostgreSQL 14.12 today, but so much for that.


Is this a bug that was fixed in a later release of 14?

Is there some other setting that must be set to get the WAL retained?


Here are the details:

Version:

PostgreSQL 14.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-44), 64-bit


Log entries (the entries that followed the last one listed just continued
to say the slot was invalid):

2024-08-23 03:07:45.926 UTC [1121] LOG:  starting logical decoding for slot "track_subscription"
2024-08-23 03:07:45.926 UTC [1121] DETAIL:  Streaming transactions committing after AB17/4A0C9F40, reading WAL from AB17/46D98068.
2024-08-23 03:07:45.926 UTC [1121] STATEMENT:  START_REPLICATION SLOT "track_subscription" LOGICAL AB17/554088B0 (proto_version '2', publication_names '"track_ingestion"')
2024-08-23 03:07:45.926 UTC [1121] LOG:  logical decoding found consistent point at AB17/46D98068
2024-08-23 03:07:45.926 UTC [1121] DETAIL:  There are no running transactions.
2024-08-23 03:07:45.926 UTC [1121] STATEMENT:  START_REPLICATION SLOT "track_subscription" LOGICAL AB17/554088B0 (proto_version '2', publication_names '"track_ingestion"')
2024-08-23 03:08:17.161 UTC [48799] LOG:  terminating process 1121 to release replication slot "track_subscription"
2024-08-23 03:08:17.161 UTC [1121] FATAL:  terminating connection due to administrator command
2024-08-23 03:08:17.161 UTC [1121] CONTEXT:  slot "track_subscription", output plugin "pgoutput", in the change callback, associated LSN AB17/663138F0
2024-08-23 03:08:17.161 UTC [1121] STATEMENT:  START_REPLICATION SLOT "track_subscription" LOGICAL AB17/554088B0 (proto_version '2', publication_names '"track_ingestion"')
2024-08-23 03:08:17.190 UTC [1121] LOG:  disconnection: session time: 0:00:33.502 user=sysrep database=trackdb host=postgresqldb03.s2a.nrl.navy.mil.31.250.132.in-addr.arpa port=36840
2024-08-23 03:08:17.195 UTC [48799] LOG:  invalidating slot "track_subscription" because its restart_lsn AB17/4D0E3320 exceeds max_slot_wal_keep_size
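
To put the LSNs in these log entries in perspective, here is a minimal sketch that converts them to byte positions and measures the distance between the restart_lsn named in the invalidation message and the LSN the walsender was decoding when it was terminated. The helper names parse_lsn and lsn_diff are hypothetical, introduced only for this illustration; on the server, the built-in pg_wal_lsn_diff() function performs the same subtraction. The result only says how far apart the two positions are; it does not explain why the slot was invalidated.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as 'AB17/4D0E3320' to a 64-bit byte position.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(newer: str, older: str) -> int:
    """Return the number of WAL bytes between two LSNs (newer - older)."""
    return parse_lsn(newer) - parse_lsn(older)


# Values copied from the log entries above.
restart_lsn = "AB17/4D0E3320"    # from the "invalidating slot" message
decoding_lsn = "AB17/663138F0"   # "associated LSN" in the CONTEXT line

gap_bytes = lsn_diff(decoding_lsn, restart_lsn)
print(f"{gap_bytes} bytes (~{gap_bytes / 1024**2:.0f} MiB)")
```

With the values above, the two positions are 421725648 bytes apart, roughly 402 MiB of WAL.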


trackdb=# select * from pg_replication_slots;

     slot_name      |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
--------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
 track_subscription | pgoutput | logical   |  16386 | trackdb  | f         | f      |            |      |    130568429 |             | AB17/554088B0       | lost       |               | f
(1 row)



show max_slot_wal_keep_size;
 max_slot_wal_keep_size
------------------------
 -1
(1 row)


Thanks,


Dennis

