MIME-Version: 1.0
References: 
 <CANOng2i1G_57nvZ4ip4uKKU87jtt+fzqWUFV_ou6L8N3bteSXQ@mail.gmail.com>
In-Reply-To: 
 <CANOng2i1G_57nvZ4ip4uKKU87jtt+fzqWUFV_ou6L8N3bteSXQ@mail.gmail.com>
From: =?UTF-8?B?S2FzcGVyIEbDuG5z?= <kasper.fons@cloudkitchens.com>
Date: Mon, 1 Dec 2025 10:49:42 +0100
Message-ID: 
 <CANOng2i6xLa-FsN1B_rZFpW807GrV3YUJVgDM3nqJEj1gCk2dg@mail.gmail.com>
Subject: Fwd: restore_command on high-throughput cluster never switches to
 streaming replication
To: pgsql-general@lists.postgresql.org
Content-Type: multipart/alternative; boundary="000000000000a4c2170644e0e6d7"
Archived-At: 
 <https://www.postgresql.org/message-id/CANOng2i6xLa-FsN1B_rZFpW807GrV3YUJVgDM3nqJEj1gCk2dg%40mail.gmail.com>
Precedence: bulk

--000000000000a4c2170644e0e6d7
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi PostgreSQL community.

I debugged an instance where a PostgreSQL standby would not switch to
streaming replication when the `restore_command` failed.
I first posted this to the pgsql-admin mailing list but got no response, so
I am trying here.

*Expectation*
I expect PostgreSQL to try switching to streaming replication if the
`restore_command` fails.

*What happens*
PostgreSQL first restores the previously restored WAL segment again and then
retries the failed segment. Because the primary produces WAL at a high rate,
the failed segment exists by the time of the retry, so the restore succeeds
and PostgreSQL does not try to switch to streaming replication.

*Context*
Running PostgreSQL 15.7 in Kubernetes using CloudNative PostgreSQL Operator.

*Logs*
I configured PostgreSQL to emit DEBUG3 level logs. Newest logs first,
oldest last.

got WAL segment from archive
executing restore command "/controller/manager wal-restore
--log-destination /controller/log/postgres.json *000000410000A7BA00000058*
pg_wal/RECOVERYXLOG"
got WAL segment from archive
executing restore command "/controller/manager wal-restore
--log-destination /controller/log/postgres.json *000000410000A7BA00000057*
pg_wal/RECOVERYXLOG"
could not open file "pg_wal/*000000410000A7BA00000058*": No such file or
directory
could not restore file "*000000410000A7BA00000058*" from archive: child
process exited with exit code 1
executing restore command "/controller/manager wal-restore
--log-destination /controller/log/postgres.json *000000410000A7BA00000058*
pg_wal/RECOVERYXLOG"
got WAL segment from archive
executing restore command "/controller/manager wal-restore
--log-destination /controller/log/postgres.json *000000410000A7BA00000057*
pg_wal/RECOVERYXLOG"

Notice that when *000000410000A7BA00000058* failed, PostgreSQL asked for
*000000410000A7BA00000057*, which it had already restored. Afterwards, it
asked for *000000410000A7BA00000058* once again.
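
To make the pattern easier to spot in a longer log, here is a minimal Python
sketch that flags every point where the requested segment goes backwards. It
assumes the default 16 MB `wal_segment_size` (0x100 segments per log id), and
the helper names `parse_segment` and `backward_steps` are made up for this
illustration.

```python
# Assumption: default 16 MB wal_segment_size, so 0x100 segments per log id.
SEGMENTS_PER_LOG = 0x100


def parse_segment(name):
    """Split a 24-hex-digit WAL file name into (timeline, segment number)."""
    if len(name) != 24:
        raise ValueError(f"not a WAL segment name: {name!r}")
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log_id * SEGMENTS_PER_LOG + seg


def backward_steps(requests):
    """Return (previous, current) pairs where the request went backwards."""
    steps = []
    for prev, cur in zip(requests, requests[1:]):
        prev_tli, prev_no = parse_segment(prev)
        cur_tli, cur_no = parse_segment(cur)
        # Only compare segments that belong to the same timeline.
        if cur_tli == prev_tli and cur_no < prev_no:
            steps.append((prev, cur))
    return steps


# Requests from the log excerpt above, reordered oldest first.
requests = [
    "000000410000A7BA00000057",
    "000000410000A7BA00000058",  # this attempt exited with code 1
    "000000410000A7BA00000057",
    "000000410000A7BA00000058",
]
for prev, cur in backward_steps(requests):
    print(f"standby went back from {prev} to {cur}")
```

For the excerpt above this reports a single backward step, from
000000410000A7BA00000058 to 000000410000A7BA00000057.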

*Problem*
This is problematic because the standby will never switch to streaming
replication.

*Workaround*
We can get the PostgreSQL replica in sync by changing the
`restore_command` to `/bin/false` once the replica is within `wal_keep_size`.
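
As a rough sketch of how "within `wal_keep_size`" could be checked, the
Python below compares two LSNs with a `wal_keep_size` given in megabytes. The
LSN values are invented for the example; in practice they could come from
`pg_current_wal_lsn()` on the primary and `pg_last_wal_replay_lsn()` on the
standby. The helper names are hypothetical, and the check is only
approximate, since the primary removes WAL in whole segments and may retain
more than `wal_keep_size`.

```python
MIB = 1024 * 1024


def lsn_to_int(lsn):
    """Convert an LSN such as 'A7BA/58000000' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def within_wal_keep_size(primary_lsn, replay_lsn, wal_keep_size_mb):
    """True if the standby's replay lag fits inside wal_keep_size."""
    lag = lsn_to_int(primary_lsn) - lsn_to_int(replay_lsn)
    return 0 <= lag <= wal_keep_size_mb * MIB


# Hypothetical values: the standby is 64 MiB behind the primary.
primary = "A7BA/5C000000"
standby = "A7BA/58000000"
print(within_wal_keep_size(primary, standby, wal_keep_size_mb=1024))
print(within_wal_keep_size(primary, standby, wal_keep_size_mb=32))
```

With these made-up values the lag is 64 MiB, so the first check passes and
the second does not.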

*Question*
Is this the expected behaviour?

I expect the function `WaitForWALToBecomeAvailable` to switch to streaming
replication as soon as a single `restore_command` call fails. That is what
happens when `/bin/false` is used instead.

Any help would be greatly appreciated.
/Kasper Føns

--000000000000a4c2170644e0e6d7--