Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tkpEV-00771J-RN for pgsql-admin@arkaria.postgresql.org; Wed, 19 Feb 2025 18:55:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tkpEU-00CJ6p-DK for pgsql-admin@arkaria.postgresql.org; Wed, 19 Feb 2025 18:55:42 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tkpET-00CJ5x-Sp for pgsql-admin@lists.postgresql.org; Wed, 19 Feb 2025 18:55:42 +0000 Received: from fw.sz-a.kwebs.cloud ([109.61.102.40]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1tkpEQ-001lDR-18 for pgsql-admin@lists.postgresql.org; Wed, 19 Feb 2025 18:55:40 +0000 Received: from webmail.srv.kwebs.cloud (172-16-36-102.prometheus-node-exporter.monitoring.svc.cluster.local [172.16.36.102]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) (Authenticated sender: richard@kojedz.in) by mx.kwebs.cloud (Postfix) with ESMTPSA id 43F78BE19; Wed, 19 Feb 2025 18:55:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kojedz.in; s=s02; t=1739991333; bh=+D24SgeEWc63rySRj9hwlyDbY4FPJQPfp3iRbDyJSQc=; h=Date:From:To:Cc:Subject:In-Reply-To:References; b=QGlprrsAUjcIatCA8yFK+3otPrjW03cMoEfs+OHss5rTjUrW3WNTlekDqngTt0Ir6 NuIqeIhUCXjOa65iLv6x2oYS4/1Cz9nTNWPsGfmLSUf6IisKEDK/7R01B4hVVYOCnc OQ2LHX3gK0pCCGibQaMjVSPT9ybRGcnEfwh7hFNPMip34GXyC1yR7EhbDpgItrHcOK oOhxQOT3NalO4vnTXpJi+c3VtKiiJ+5IyuDkMyGzH+YGiQh2VieErufp/qyv24JHYR 26dFYGoDHDhWQRsZ7EHpq+jUnKnLml4FoX4uK7DAihuz6XcJeEoQ1Qc9bQW0n0hnGh aQgYIr3cHcfRg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kojedz.in; s=s01; t=1739991333; bh=+D24SgeEWc63rySRj9hwlyDbY4FPJQPfp3iRbDyJSQc=; h=Date:From:To:Cc:Subject:In-Reply-To:References; b=qHyCaGjzJSRnB5p60KyOklXEGSE2yVQfjPD6fyAtjNa0dL98FvYAw8lPMUk1dGa94 jOslkumqZOchP3YqbRf8dEMt4kYH36MkwQMPtnVin7ycHniOrhBgSN8cclmzicXEM+ 0y/MMDfzU5fm9sLi8wXtZGNdxDrm5zpNG9fAsWwJr/WLobxnnJQ0ON6mG2HZRuhFoG vdxfpXCYYQ5IGlj1zex+XMwDG2c/44otz8WfsrUuZNdZUgAdTl5RIlv1S4aQFaeuTl tXdA87oVXAPebGtAhmHtbEUit0VScluRkM93frEB/IyKSjB/UJp7ceMGbOuHm/tYk5 MjrYp7sovFHnDMnbp+vihaAwf8p979hZl4E+W7K+dG/0JG52ayYMFf4YmnQqzymMfR CkAQxrXOgeUNCn2kN/d8AepohIhvZ2rLgJG9vS3KIYseBXYmkJX38uJ7tkTpOWrv+i 9PuPGUq+4J2/Hc5e/mmZb9jj1RWtTP5lZ+QiPVFmh3AndfUEL98 MIME-Version: 1.0 Date: Wed, 19 Feb 2025 19:55:32 +0100 From: richard@kojedz.in To: =?UTF-8?Q?=C3=81lvaro_Herrera?= Cc: pgsql-admin@lists.postgresql.org Subject: Re: In-place upgrade with streaming replicas In-Reply-To: <202502191554.6asefyczl7jn@alvherre.pgsql> References: <202502191554.6asefyczl7jn@alvherre.pgsql> Message-ID: X-Sender: richard@kojedz.in Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Dear Alvaro, Thanks for your answers. Unfortunately, I was unaware of a shutdown record, that makes a difference then. So, I definitely must stop the primary first, then use pg_controldata to obtain checkpoint info. Then, can I query the replicas while they are up and running if they've received the shutdown record or not? So, after shutting down the primary, how will I know if a replica has received the mentioned record, and is safe to shutdown? Thanks for the clarifications. Best regards, Richard 2025-02-19 16:54 időpontban Álvaro Herrera ezt írta: > On 2025-Feb-19, richard@kojedz.in wrote: > >> With this, I have the question, that after the shutdown of primary, >> what is >> the guarantee for replicas having the same checkpoint location? Why >> does the >> order of shutting down the servers matter? What would be the really >> exact >> and reliable way to ensure that replicas will have the same checkpoint >> location as the primary? > > The replicas can't write WAL by themselves, but they will replay > whatever the primary has sent; by shutting down the primary first and > letting the replicas catch up, you ensure that the replicas will > actually receive the shutdown record and replay it. If you shut down > the replicas first, they can obviously never catch up with the shutdown > checkpoint of the primary. > > As I recall, if you do shut down the primary first, one potential > danger > is that the primary fails to send the checkpoint record before shutting > down, so the replicas won't receive it and obviously will not replay > it; > or simply that they are behind enough that they receive it but don't > replay it. > > You could use pg_controldata to read the last checkpoint info from all > nodes. You can run it on the primary after shutting it down, and then > on each replica while it's still running to ensure that the correct > restartpoint has been created.