public inbox for [email protected]  
help / color / mirror / Atom feed
From: [email protected]
To: Jerry Sievers <[email protected]>
Cc: Álvaro Herrera <[email protected]>
Cc: [email protected]
Subject: Re: In-place upgrade with streaming replicas
Date: Thu, 20 Feb 2025 08:55:31 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>

Dear Jerry,

So, yes it turns out that some kind of loop must be involved here, as 
you described:

1. ensure cluster is running
2. stop primary
3. wait some time
4. stop replicas
5. check if checkpoint locations match. repeat from step 1 if 
out-of-sync.

My question here is, the unreliable step here is 3rd one. Can we query 
the replica runtime if he did catch up? I mean, that after stopping the 
primary, we can obtain the checkpoint location from pg_controldata, 
then, can we somehow query the running replica about that?

Thanks in advance,
Richard

2025-02-20 08:49 időpontban Jerry Sievers ezt írta:
> [email protected] writes:
> 
>> Dear Alvaro,
>> 
>> Thanks for your answers. Unfortunately, I was unaware of a shutdown
>> record, that makes a difference then. So, I definitely must stop the
>> primary first, then use pg_controldata to obtain checkpoint
>> info. Then, can I query the replicas while they are up and running if
>> they've received the shutdown record or not? So, after shutting down
>> the primary, how will I know if a replica has received the mentioned
>> record, and is safe to shutdown?
> 
> 
> 
> Hmmm, not sure about that but what we do, is stop primary, wait a
> $short time, then stop replicas...
> 
> Then run pg_controldata on all nodes | filter out only the line
> indicating latest checkpoint and sort -u the output.  Expect only a
> single line if all are matched.
> 
> You may also wish to first insure that you got the same number of
> lines as total node count before doing the sorting and uniqueing.
> 
> Very rarely on our huge systems, we'd have a mismatch after the
> verification in in those cases, our automated upgrade procedure
> restarts all nodes and then does the shutdown and verify check again.
> 
> HTH
> 
> 
>> 
>> Thanks for the clarifications.
>> 
>> Best regards,
>> Richard
>> 
>> 2025-02-19 16:54 időpontban Álvaro Herrera ezt írta:
>>> On 2025-Feb-19, [email protected] wrote:
>>> 
>>>> With this, I have the question, that after the shutdown of primary,
>>>> what is
>>>> the guarantee for replicas having the same checkpoint location? Why
>>>> does the
>>>> order of shutting down the servers matter? What would be the really
>>>> exact
>>>> and reliable way to ensure that replicas will have the same 
>>>> checkpoint
>>>> location as the primary?
>>> The replicas can't write WAL by themselves, but they will replay
>>> whatever the primary has sent; by shutting down the primary first and
>>> letting the replicas catch up, you ensure that the replicas will
>>> actually receive the shutdown record and replay it.  If you shut down
>>> the replicas first, they can obviously never catch up with the 
>>> shutdown
>>> checkpoint of the primary.
>>> As I recall, if you do shut down the primary first, one potential
>>> danger
>>> is that the primary fails to send the checkpoint record before 
>>> shutting
>>> down, so the replicas won't receive it and obviously will not replay
>>> it;
>>> or simply that they are behind enough that they receive it but don't
>>> replay it.
>>> You could use pg_controldata to read the last checkpoint info from
>>> all
>>> nodes.  You can run it on the primary after shutting it down, and 
>>> then
>>> on each replica while it's still running to ensure that the correct
>>> restartpoint has been created.





reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: In-place upgrade with streaming replicas
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox