MIME-Version: 1.0
Date: Wed, 19 Feb 2025 19:55:32 +0100
From: richard@kojedz.in
To: =?UTF-8?Q?=C3=81lvaro_Herrera?= <alvherre@alvh.no-ip.org>
Cc: pgsql-admin@lists.postgresql.org
Subject: Re: In-place upgrade with streaming replicas
In-Reply-To: <202502191554.6asefyczl7jn@alvherre.pgsql>
References: <202502191554.6asefyczl7jn@alvherre.pgsql>
Message-ID: <d438397ad2c67e0ce683bc3158746691@kojedz.in>
Content-Type: text/plain; charset=UTF-8;
 format=flowed
Content-Transfer-Encoding: 8bit
Archived-At: 
 <https://www.postgresql.org/message-id/d438397ad2c67e0ce683bc3158746691%40kojedz.in>
Precedence: bulk

Dear Alvaro,

Thanks for your answers. I was unaware of the shutdown record, and that 
does make a difference. So I definitely must stop the primary first and 
then use pg_controldata to obtain its checkpoint info. After that, can I 
query the replicas while they are still up and running to find out 
whether they have received the shutdown record? In other words, once the 
primary is shut down, how will I know that a replica has received that 
record and is safe to shut down?
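
To make the question concrete, here is a minimal sketch of the 
comparison I have in mind. It assumes that the "Latest checkpoint 
location" line printed by pg_controldata is the value to compare across 
nodes. The helper names (parse_controldata, read_controldata, 
replicas_behind) are hypothetical, and I do not know whether this check 
alone is sufficient:

```python
import os
import subprocess

CHECKPOINT_KEY = "Latest checkpoint location"


def parse_controldata(text):
    """Turn the 'Key:   value' lines of pg_controldata output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as timestamps contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def read_controldata(datadir):
    """Run pg_controldata on a local data directory and parse its output.

    LC_ALL=C is forced because the field labels are translated under
    other locales, which would break the key lookup below.
    """
    env = {**os.environ, "LC_ALL": "C"}
    result = subprocess.run(
        ["pg_controldata", datadir],
        check=True, capture_output=True, text=True, env=env,
    )
    return parse_controldata(result.stdout)


def replicas_behind(primary_fields, replica_fields_by_name):
    """Return the names of replicas whose checkpoint location differs
    from the one recorded on the (already shut down) primary."""
    target = primary_fields[CHECKPOINT_KEY]
    return sorted(
        name
        for name, fields in replica_fields_by_name.items()
        if fields.get(CHECKPOINT_KEY) != target
    )
```

I would run read_controldata on the primary's data directory after 
stopping it, then on each replica's data directory, and only stop a 
replica once replicas_behind no longer lists it.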

Thanks for the clarifications.

Best regards,
Richard

On 2025-02-19 16:54, Álvaro Herrera wrote:
> On 2025-Feb-19, richard@kojedz.in wrote:
> 
>> With this, I have the question, that after the shutdown of primary,
>> what is the guarantee for replicas having the same checkpoint
>> location? Why does the order of shutting down the servers matter?
>> What would be the really exact and reliable way to ensure that
>> replicas will have the same checkpoint location as the primary?
> 
> The replicas can't write WAL by themselves, but they will replay
> whatever the primary has sent; by shutting down the primary first and
> letting the replicas catch up, you ensure that the replicas will
> actually receive the shutdown record and replay it.  If you shut down
> the replicas first, they can obviously never catch up with the shutdown
> checkpoint of the primary.
> 
> As I recall, if you do shut down the primary first, one potential
> danger is that the primary fails to send the checkpoint record before
> shutting down, so the replicas won't receive it and obviously will not
> replay it; or simply that they are behind enough that they receive it
> but don't replay it.
> 
> You could use pg_controldata to read the last checkpoint info from all
> nodes.  You can run it on the primary after shutting it down, and then
> on each replica while it's still running to ensure that the correct
> restartpoint has been created.