From: Dennis White <[email protected]>
To: Pgsql-admin <[email protected]>
Subject: Logical replication slot wal_status "lost" with max_slot_wal_keep_size = -1
Date: Tue, 15 Oct 2024 11:36:26 -0400
Message-ID: <CAE=rie9H6p51y8S=nCqzY-3rv0rvgQJcx8Qjoy1ryyhdKcty-w@mail.gmail.com>

My project's replication is failing with the following error:

2024-10-15 14:03:38.446 UTC [2840947] STATEMENT:  SELECT pg_catalog.set_config('search_path', '', false);
2024-10-15 14:03:38.446 UTC [2840947] ERROR:  cannot read from logical replication slot "track_subscription"
2024-10-15 14:03:38.446 UTC [2840947] DETAIL:  This slot has been invalidated because it exceeded the maximum reserved size.
2024-10-15 14:03:38.446 UTC [2840947] STATEMENT:  START_REPLICATION SLOT "track_subscription" LOGICAL 1380B/CBFAEFF0 (proto_version '2', publication_names '"track_ingestion"')


trackdb=# select * from pg_replication_slots;
     slot_name      |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
--------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
 track_subscription | pgoutput | logical   |  16402 | trackdb  | f         | f      |            |      |    406428081 |             | 1380B/BAB7B328      | lost       |               | f

Publisher and Subscriber DB versions:
PostgreSQL 14.12 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit

Publisher System settings:
max_slot_wal_keep_size = -1
max_wal_size = 12GB
wal_keep_size = 0

I have controls in place to keep the replication lag from growing too
large, but I was surprised to see wal_status become "lost" given what I
read about the default value of max_slot_wal_keep_size (-1, which, as I
understand it, places no limit on the WAL a slot can retain).
My research into this problem suggests I should increase max_wal_size to
96GB and perhaps set max_slot_wal_keep_size = 0.
Is this correct, or is there something else I should do to prevent this from
*ever* happening again?
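
For context, a slot health check of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the exact control in use here: it assumes it is run on the publisher, and the column alias unconfirmed_wal is a name chosen for this example. It reports each logical slot's wal_status and how much WAL has been generated past the slot's confirmed_flush_lsn.

```sql
-- Run on the publisher (a primary, not a standby in recovery).
-- wal_status and safe_wal_size are available in PostgreSQL 13 and later.
SELECT slot_name,
       active,
       wal_status,
       safe_wal_size,
       -- WAL written since the subscriber last confirmed a flush position
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
       ) AS unconfirmed_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

A monitoring job could alert when wal_status is anything other than "reserved", or when unconfirmed_wal passes a limit chosen for the available disk space.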

Thanks,
Dennis

