From: Michael Paquier <[email protected]>
To: Jehan-Guillaume de Rorthais <[email protected]>
Cc: Nikolay Samokhvalov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Subject: Re: Using old master as new replica after clean switchover
Date: Thu, 25 Oct 2018 20:45:57 +0900
Message-ID: <[email protected]>
In-Reply-To: <20181025111551.620c6460@firost>
References: <CANNMO+KYuH3Gh7BZp=UGXpoos4tBR0AFgoONkqWBrokuJthEug@mail.gmail.com>
<20181025111551.620c6460@firost>
On Thu, Oct 25, 2018 at 11:15:51AM +0200, Jehan-Guillaume de Rorthais wrote:
> On Thu, 25 Oct 2018 02:57:18 -0400
> Nikolay Samokhvalov <[email protected]> wrote:
>> My research shows that some people already rely on the following
>> planned failover (aka switchover) procedure in production:
>>
>> 1) shutdown the current master
>> 2) ensure that the "master candidate" replica has received all WAL data
>> including shutdown checkpoint from the old master
>> 3) promote the master candidate to make it new master
>> 4) configure recovery.conf on the old master node, while it's inactive
>> 5) start the old master node as a new replica following the new master.
>
> Indeed.
The important point here is that the primary waits for the shutdown
checkpoint record to be replayed on the standbys before it finishes
shutting down.
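
Step 2 of the quoted procedure can be cross-checked by comparing two LSNs. What follows is a minimal sketch, not an official procedure. It assumes you have copied the "Latest checkpoint location" value printed by pg_controldata on the stopped old primary, and the result of pg_last_wal_receive_lsn() taken on the candidate standby before promoting it. The function names are hypothetical, and the comparison is a necessary condition rather than a full safety proof.

```python
def parse_lsn(text: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def standby_has_shutdown_checkpoint(checkpoint_lsn: str,
                                    standby_receive_lsn: str) -> bool:
    """Return True if the standby received WAL past the checkpoint record start.

    checkpoint_lsn: 'Latest checkpoint location' from pg_controldata on the
    cleanly stopped old primary (the start of the shutdown checkpoint record).
    standby_receive_lsn: pg_last_wal_receive_lsn() on the candidate standby.
    """
    # The checkpoint location is where the record begins, so the standby
    # must have received WAL strictly beyond it to hold the whole record.
    return parse_lsn(standby_receive_lsn) > parse_lsn(checkpoint_lsn)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(standby_has_shutdown_checkpoint("0/3000028", "0/3000098"))
```

If the check fails, the standby is missing the tail of the old primary's WAL and should not be promoted as part of a clean switchover.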
> The only additional nice step would be to be able to run some more safety tests
> AFTER the switchover process on the old master. The only way I can think of
> would be to run pg_rewind even if it doesn't do much.
Do you have something specific in mind here? I am curious if you're
thinking about things like page-level checks for LSN matches under some
threshold or such, because you should not have pages on the previous
primary which have LSNs newer than the point up to which the standby has
replayed.
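
To make the idea of a page-level LSN check concrete, here is a rough sketch of what such a scan could look like. It is an illustration only, not an existing tool. It assumes the default 8192-byte block size, that the bytes come from a relation segment file read while the old primary is stopped, and that the script runs on a machine with the same byte order as the server. It relies on pd_lsn being the first field of the page header, stored as two 32-bit halves. The function names are hypothetical.

```python
import struct

BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size


def page_lsns(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield (block_number, lsn) for each full page in a segment's bytes."""
    for blkno in range(len(data) // block_size):
        # pd_lsn opens the page header: high 32 bits, then low 32 bits,
        # both in the server's native byte order.
        high, low = struct.unpack_from("=II", data, blkno * block_size)
        yield blkno, (high << 32) | low


def pages_newer_than(data: bytes, replay_lsn: int,
                     block_size: int = BLOCK_SIZE) -> list:
    """Return block numbers whose page LSN exceeds the standby's replay LSN."""
    return [blk for blk, lsn in page_lsns(data, block_size) if lsn > replay_lsn]
```

After a clean switchover the returned list is expected to be empty for every relation file. Any block reported here would mean the old primary holds changes the promoted standby never replayed.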
>> if so, let's add it to the documentation, making it official. The patch is
>> attached.
>
> I suppose we should add the technical steps in a sample procedure?
If an addition to the docs is done, presenting the steps as a list
would be cleaner, perhaps in a dedicated section or a new
sub-section. The failover flow you are mentioning is good practice
because it is safe, and there is always room for improvement in the
docs.
--
Michael