public inbox for [email protected]  
From: Craig McIlwee <[email protected]>
To: Torsten Förtsch <[email protected]>
Cc: [email protected]
Subject: Re: Trouble using pg_rewind to undo standby promotion
Date: Thu, 7 Nov 2024 08:26:46 -0500
Message-ID: <CAGqBcTZKSYTuVmf6ppR=GKYPtgKKOp6DASaP6YZYUAks49EHoQ@mail.gmail.com> (raw)
In-Reply-To: <CAKkG4_kt9zZNPwB2Qd0-jAtRn3m=JPWW_-hc8VSyhfZ+VjfKQw@mail.gmail.com>
References: <CAGqBcTbWpCuCRuWHd9209k2Q0CHA4j+2MnRR2SwszCXnErG9hw@mail.gmail.com>
	<CAKkG4_kt9zZNPwB2Qd0-jAtRn3m=JPWW_-hc8VSyhfZ+VjfKQw@mail.gmail.com>

On Thu, Nov 7, 2024 at 4:47 AM Torsten Förtsch <[email protected]>
wrote:

> Your point of divergence is in the middle of the 7718/000000BF file. So,
> you should have 2 such files eventually, one on timeline 1 and the other on
> timeline 2.
>
> Are you archiving WAL on the promoted machine in a way that your
> restore_command can find it? Check archive_command and archive_mode on the
> promoted machine.
>

No, the promoted machine is not archiving.  How should that work?  Is it OK
for a log shipping standby that uses restore_command to also push to the
same directory with an archive_command or would that cause issues of trying
to read and write the same file simultaneously during WAL replay?  Or
should I be setting up an archive_command that pushes to a separate
directory and have a restore_command that knows to check both locations?

Hmm, as I write that out, I realize that I could use archive_mode = on
instead of archive_mode = always to avoid the potential for read/write
conflicts during WAL replay.  I can try this later and report back.

> Also, do your archive/restore scripts work properly for history files?
>

The scripts don't do anything special with history files.  They are based
on the continuous archiving docs [1] and this article [2], with a slight
modification to include a throttled scp, since the log shipping server is
located in a different data center from the promoted standby and there is
limited bandwidth between the two.  (Also note that the archive script from
[2] is adapted to properly handle file transfer failures - the one in the
article will use the exit code of the rm command, so postgres won't be
informed when the file transfer fails, resulting in missing WAL in the
archive.)

Archive script:
---
#!/bin/bash

# $1 = %p
# $2 = %f

limit=10240 # 10Mbps

gzip < /var/lib/pgsql/13/data/$1 > /tmp/archive/$2.gz

scp -l $limit /tmp/archive/$2.gz [email protected]:/data/wal_archive/operational/$2.gz
exit_code=$?

rm /tmp/archive/$2.gz

exit $exit_code
---
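
The exit-code handling described above can be isolated into a small helper.
This is only a sketch: the function name transfer_then_cleanup is
hypothetical, and the transfer command is passed in as arguments so the
pattern can be tried without scp.

```shell
#!/bin/bash

# Hypothetical helper (not part of the script above): run a transfer
# command, always remove the temporary file afterwards, and return the
# transfer's exit status instead of rm's.
#
# $1   = temporary file to delete once the transfer has been attempted
# $2.. = the transfer command and its arguments
transfer_then_cleanup() {
    local tmpfile=$1
    shift

    "$@"              # run the transfer (e.g. scp ...)
    local status=$?   # capture its status before rm can overwrite $?

    rm -f "$tmpfile"  # cleanup happens whether or not the transfer worked
    return "$status"  # the caller sees the transfer's result
}
```

In an archive script this might be invoked as
transfer_then_cleanup "/tmp/archive/$2.gz" scp -l "$limit" ... followed by
exit $?, so a failed copy is reported to postgres as a failed archive
attempt.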

Restore script:
---
gunzip < /data/wal_archive/operational/$2.gz > $1
---
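
For comparison, a slightly more defensive variant of the restore step could
look like the sketch below. The function name restore_wal and the ".partial"
suffix are assumptions made for illustration. Because it keys only on the
file name, a timeline history file such as 00000002.history goes through the
same path as a WAL segment, provided it was archived under the same naming
scheme.

```shell
#!/bin/bash

# Hypothetical helper, not the restore script from this thread.
#
# $1 = archive directory holding the gzip-compressed files
# $2 = file name requested by postgres (%f), WAL segment or .history file
# $3 = destination path (%p)
restore_wal() {
    local archive_dir=$1 fname=$2 dest=$3
    local src="$archive_dir/$fname.gz"

    # A non-zero status tells postgres the file is not in the archive.
    [ -f "$src" ] || return 1

    # Decompress to a temporary name first so that a failed gunzip never
    # leaves a truncated file at the destination.
    if gunzip -c "$src" > "$dest.partial"; then
        mv "$dest.partial" "$dest"
    else
        rm -f "$dest.partial"
        return 1
    fi
}
```

With the argument order used by the scripts above ($1 = %p, $2 = %f), a
wrapper might call restore_wal /data/wal_archive/operational "$2" "$1".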

[1]
https://www.postgresql.org/docs/13/continuous-archiving.html#COMPRESSED-ARCHIVE-LOGS
[2]
https://www.rockdata.net/tutorial/admin-archive-command/#compressing-and-archiving

Craig

>

