MIME-Version: 1.0
References: <CAGqBcTbWpCuCRuWHd9209k2Q0CHA4j+2MnRR2SwszCXnErG9hw@mail.gmail.com>
In-Reply-To: <CAGqBcTbWpCuCRuWHd9209k2Q0CHA4j+2MnRR2SwszCXnErG9hw@mail.gmail.com>
From: =?UTF-8?Q?Torsten_F=C3=B6rtsch?= <tfoertsch123@gmail.com>
Date: Thu, 7 Nov 2024 10:46:49 +0100
Message-ID: <CAKkG4_kt9zZNPwB2Qd0-jAtRn3m=JPWW_-hc8VSyhfZ+VjfKQw@mail.gmail.com>
Subject: Re: Trouble using pg_rewind to undo standby promotion
To: Craig McIlwee <craigm@vt.edu>
Cc: pgsql-general@lists.postgresql.org
Content-Type: multipart/alternative; boundary="00000000000013939206264f84ef"
Archived-At: <https://www.postgresql.org/message-id/CAKkG4_kt9zZNPwB2Qd0-jAtRn3m%3DJPWW_-hc8VSyhfZ%2BVjfKQw%40mail.gmail.com>
Precedence: bulk

--00000000000013939206264f84ef
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Your point of divergence is inside WAL segment 7718/000000BF. So you should
eventually have two files for that segment, one on timeline 1
(0000000100007718000000BF) and the other on timeline 2
(0000000200007718000000BF).
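
In case it helps to see how those file names follow from the LSN that
pg_rewind reported, here is a minimal Python sketch. It assumes the default
16 MB wal_segment_size (worth confirming with SHOW wal_segment_size), and
wal_file_name is just an illustrative helper, not something PostgreSQL ships.

```python
# Sketch: derive the WAL segment file name for a given timeline and LSN.
# Assumption: 16 MB segments (the PostgreSQL default).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_name(timeline: int, lsn: str, segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the 24-character WAL file name that contains `lsn` on `timeline`."""
    high_hex, low_hex = lsn.split("/")
    # An LSN is a 64-bit byte position written as two 32-bit hex halves.
    position = (int(high_hex, 16) << 32) | int(low_hex, 16)
    segment_number = position // segment_size
    # Number of segments that fit into one 4 GB "log" unit.
    segments_per_log = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_log,
        segment_number % segments_per_log,
    )


if __name__ == "__main__":
    divergence = "7718/BFFFFFE8"  # from the pg_rewind output
    for timeline in (1, 2):
        print(timeline, wal_file_name(timeline, divergence))
```

Both timelines map the divergence LSN to segment BF of log 7718; only the
leading timeline part of the name differs.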

Are you archiving WAL on the promoted machine in a way that your
restore_command can find it? Check archive_command and archive_mode on the
promoted machine.

Also, do your archive/restore scripts work properly for history files?
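
A quick way to check both points is to look in the archive directory for the
timeline 2 segment and the history file. The sketch below is only an
illustration: it assumes the archive layout visible in your log output
(files stored with a .gz suffix under /data/wal_archive/operational), and
missing_from_archive is a made-up helper name.

```python
# Sketch: report which expected files are absent from a WAL archive directory.
# Assumptions: archived files carry a ".gz" suffix and live in one flat
# directory, as the restore script's error messages suggest.
from pathlib import Path


def missing_from_archive(archive_dir, names, suffix=".gz"):
    """Return the entries of `names` that have no `<name><suffix>` file in `archive_dir`."""
    archive = Path(archive_dir)
    return [name for name in names if not (archive / (name + suffix)).is_file()]


if __name__ == "__main__":
    wanted = ["0000000200007718000000BF", "00000002.history"]
    for name in missing_from_archive("/data/wal_archive/operational", wanted):
        print("missing from archive:", name)
```

If either name is reported as missing after the promoted server has been
running for a while, that points at archive_mode or archive_command on the
promoted machine.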

On Wed, Nov 6, 2024 at 7:48 PM Craig McIlwee <craigm@vt.edu> wrote:

> I have a primary -> standby 1 -> standby 2 setup with all servers running
> PG 13.8 (this effort is part of getting on to a newer version, but I think
> those details aren't relevant to this problem).  The first standby uses
> streaming replication from the primary and the second standby is using a
> WAL archive with a restore_command.  To make this standby chain work,
> standby 1 is set to archive_mode = always with a command that populates the
> WAL archive.
>
> I would like to be able to promote standby 2 (hereon referred to just as
> 'standby'), perform some writes, then rewind it back to the point before
> promotion so it can become a standby again.  The documentation for
> pg_rewind says that this is supported and it seems like it should be
> straightforward, but I'm not having any luck getting this to work so I'm
> hoping someone can point out what I'm doing wrong.  Here's what I did:
>
> First, observe that WAL is properly being applied from the archive.  Note
> that we are currently on timeline 1.
>
> 2024-11-06 09:51:23.286 EST [5438] LOG:  restored log file
> "0000000100007711000000F9" from archive
> 2024-11-06 09:51:23.434 EST [5438] LOG:  restored log file
> "0000000100007711000000FA" from archive
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/0000000100007711000000FB.gz: No such file or
> directory
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/00000002.history.gz: No such file or directory
>
> Next, stop postgres, set wal_log_hints = on as required by pg_rewind, and
> restart postgres.  I also make a copy of the data directory while
> postgres is not running so I can repeat my test, which works fine on a
> small test database but won't be possible for the multi-TB database that I
> will eventually be doing this on.
>
> Now promote the standby using "select pg_promote()" and see that it
> switches to a new timeline.  You can also see that the last WAL applied
> from the archive is 7718/BF.
>
> 2024-11-06 12:10:10.831 EST [4336] LOG:  restored log file
> "0000000100007718000000BD" from archive
> 2024-11-06 12:10:10.996 EST [4336] LOG:  restored log file
> "0000000100007718000000BE" from archive
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/0000000100007718000000BF.gz: No such file or
> directory
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/00000002.history.gz: No such file or directory
> 2024-11-06 12:10:15.384 EST [4336] LOG:  restored log file
> "0000000100007718000000BF" from archive
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/0000000100007718000000C0.gz: No such file or
> directory
> 2024-11-06 12:10:15.457 EST [4336] LOG:  received promote request
> 2024-11-06 12:10:15.457 EST [4336] LOG:  redo done at 7718/BFFFFF30
> 2024-11-06 12:10:15.457 EST [4336] LOG:  last completed transaction was at
> log time 2024-11-06 12:10:22.627074-05
> 2024-11-06 12:10:15.593 EST [4336] LOG:  restored log file
> "0000000100007718000000BF" from archive
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/00000002.history.gz: No such file or directory
> 2024-11-06 12:10:15.611 EST [4336] LOG:  selected new timeline ID: 2
> 2024-11-06 12:10:15.640 EST [4336] LOG:  archive recovery complete
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/00000001.history.gz: No such file or directory
> 2024-11-06 12:10:17.028 EST [4329] LOG:  database system is ready to
> accept connections
>
> Next, insert a record just to make some changes that I want to roll
> back later.  (What I will eventually be doing is creating a publication
> to ship data to a newer version, but again, that's not what's important
> here.)
>
> Finally, shut down postgres and attempt a rewind.  The address used in the
> --source-server connection string is the address of the primary.
>
> 2024-11-06 12:11:11.139 EST [4329] LOG:  received fast shutdown request
> 2024-11-06 12:11:11.143 EST [4329] LOG:  aborting any active transactions
> 2024-11-06 12:11:11.144 EST [4329] LOG:  background worker "logical
> replication launcher" (PID 5923) exited with exit code 1
> 2024-11-06 12:11:40.933 EST [4342] LOG:  shutting down
> 2024-11-06 12:11:41.753 EST [4329] LOG:  database system is shut down
>
> /usr/pgsql-13/bin/pg_rewind --target-pgdata=/data/pgsql/operational
> --source-server="host=x.x.x.x dbname=postgres user=xxx password=xxx"
> --dry-run --progress --restore-target-wal
>
> pg_rewind: connected to server
> pg_rewind: servers diverged at WAL location 7718/BFFFFFE8 on timeline 1
> /data/wal_archive/restore_operational.sh: line 2:
> /data/wal_archive/operational/0000000200007718000000BF.gz: No such file or
> directory
> pg_rewind: error: could not restore file "0000000200007718000000BF" from
> archive
> pg_rewind: fatal: could not find previous WAL record at 7718/BFFFFFE8
>
> pg_rewind shows the point of divergence as 7718/BF on timeline 1, but when
> it tries to replay WAL using the restore command it is trying to find WAL
> from timeline 2 rather than picking back up on timeline 1.  I tried
> setting recovery_target_timeline on the target database to 'current' and
> '1' but that gave the same result. Searching the archives, [1] mentions the
> need to force a checkpoint after promotion which I tried even though the
> problem description isn't the same.  [2] mentions a problem that looks more
> like the one I am facing but has no responses.  At this point I don't know
> what to do next and hope someone can point me in the right direction.
>
> [1]
> https://www.postgresql.org/message-id/e7b16ddea93a92575cb6d143b6ef602cab22432e.camel%40cybertec.at
> [2]
> https://www.postgresql.org/message-id/CALp3DH1fLZmPvkOteAbUo4TOLZP-LstKOs6Gcw3Bm7acmJqk=w@mail.gmail.com
>
> Craig
>

--00000000000013939206264f84ef--