From: Evgeniy Ratkov <[email protected]>
To: [email protected]
Subject: Re: BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done.
Date: Tue, 18 Mar 2025 16:27:48 +0300
Message-ID: <[email protected]>

On 08/09/2024 15:26, Heikki Linnakangas wrote:
> 2. Independently of pg_rewind: When you start PostgreSQL, it will first
> try to recover all the WAL it has locally in pg_wal. That goes wrong if
> you have set a recovery target TLI. For example, imagine this situation:
>
> - Recovery target TLI is 2, set explicitly in postgresql.conf
> - The switchpoint from TLI 1 to 2 happened at WAL position 0/1510198
> (the switchpoint is found in 00000002.history)
> - There is a WAL file 000000010000000000000001 under pg_wal, which
> contains valid WAL up to 0/1590000
>
> When you start the server, it will first recover all the WAL from
> 000000010000000000000001, up to 0/1590000. Then it will connect to the
> primary to fetch more WAL, but it will fail to make any progress because
> it already blew past the switch point.
>
> It's obviously wrong to replay the WAL from timeline 1 beyond the 1->2
> switchpoint, when the recovery target is TLI 2. The attached
> 0003-Don-t-read-past-current-TLI-during-archive-recovery.patch fixes
> that. However, the logic to find the right WAL segment file and read the
> WAL is extremely complicated, and I don't feel comfortable that I got
> all the cases right. Review would be highly appreciated.
>
> The patch includes a test case to demonstrate the case, with no
> pg_rewind. It does include one "manual" step to copy a timeline history
> file into pg_wal, marked with XXX, however. So I'm not sure how possible
> this scenario is in production setups.
Hello, Heikki Linnakangas.
Your patch
0003-Don-t-read-past-current-TLI-during-archive-recovery.patch fixes
the problem with recovering a backup on a standby, which I described at
https://www.postgresql.org/message-id/acf3141b-c78d-4f28-8e15-92ed8144331e%40arenadata.io
That thread also contains a test that can demonstrate the problem.
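As a rough illustration of the scenario quoted above (not the logic of
the patch itself), the numbers can be checked with a small Python
sketch. The helper names (parse_lsn, format_lsn, tli_from_segment_name,
replay_limit) and the dictionary standing in for the contents of
00000002.history are assumptions made up for this example; the LSNs and
the segment file name are the ones from the quoted message.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' in hex (e.g. '0/1510198') to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn: int) -> str:
    """Render an integer LSN back in the 'hi/lo' hex notation."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def tli_from_segment_name(name: str) -> int:
    """The first 8 hex digits of a 24-character WAL segment name are the timeline ID."""
    return int(name[:8], 16)


def replay_limit(segment_name: str, valid_wal_end: str,
                 target_tli: int, switchpoints: dict) -> int:
    """Hypothetical helper: how far WAL from this segment may be replayed.

    switchpoints maps a parent TLI to the LSN at which the next timeline
    forked off; it is only a stand-in for a parsed timeline history file.
    """
    end = parse_lsn(valid_wal_end)
    file_tli = tli_from_segment_name(segment_name)
    if file_tli == target_tli:
        # The segment is already on the target timeline: nothing to cap.
        return end
    # Older timeline: stop at the switchpoint, even if more valid WAL exists.
    return min(end, parse_lsn(switchpoints[file_tli]))


if __name__ == "__main__":
    limit = replay_limit("000000010000000000000001", "0/1590000",
                         target_tli=2, switchpoints={1: "0/1510198"})
    overshoot = parse_lsn("0/1590000") - limit
    print("replay TLI 1 up to:", format_lsn(limit))
    print("bytes that would be replayed past the switchpoint:", overshoot)
```

With the quoted values, the cap lands on the switchpoint rather than on
the end of the valid WAL in the TLI 1 segment, which is the behaviour
the quoted message says recovery should have.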
Thank you.