Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2fzr-000Uyu-0q for pgsql-hackers@arkaria.postgresql.org; Wed, 18 Mar 2026 01:47:05 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w2fzq-006VZI-0n for pgsql-hackers@arkaria.postgresql.org; Wed, 18 Mar 2026 01:46:54 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2fwl-006Rm5-1V for pgsql-hackers@lists.postgresql.org; Wed, 18 Mar 2026 01:43:44 +0000 Received: from fhigh-b5-smtp.messagingengine.com ([202.12.124.156]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1w2fwi-00000000nVD-01gd for pgsql-hackers@postgresql.org; Wed, 18 Mar 2026 01:43:43 +0000 Received: from phl-compute-08.internal (phl-compute-08.internal [10.202.2.48]) by mailfhigh.stl.internal (Postfix) with ESMTP id 454057A01B1; Tue, 17 Mar 2026 21:43:37 -0400 (EDT) Received: from phl-frontend-04 ([10.202.2.163]) by phl-compute-08.internal (MEProxy); Tue, 17 Mar 2026 21:43:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=paquier.xyz; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm1; t=1773798217; x=1773884617; bh=SCB5b0ZyoE eK/ceVR7NakRoqStLOvE++NDTavVCFPxc=; b=wpsByaHcmcvO6CwPkSDD6k7McF osLR4YAFRJJIFrIgF4/rOWiu86NB0e951g1Ye0RHh+MxQVkjOTASL63eNjfCPp+e uKBgd1cIZ9qymSWQgrYi2Hnw06v92e/T1rZmFcSVTfGbn2dzTctYN4JPFssct6kE iyLPKS2X9F/PN50Oovas80YXo+IcI4Zu1oPsIVCb4HLglGnJ5Z2nSngR4Tjb1gvr SZrnxoZkL6p26ywAVckdvqrgJ+xIR95kp7tgk3SLLxMg8gaon9p9cuNH3dvSgIYk LW953sF4SWx4oawcNNG8X64T//wSdf9B4+YF03iq5nKrAYZQ9v0X2REPBXfA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t= 1773798217; x=1773884617; bh=SCB5b0ZyoEeK/ceVR7NakRoqStLOvE++NDT avVCFPxc=; b=PGm8efmoUj0Vb32sfbAUtPxfqmGsPfyFj9KRUHev7NBoXO05J2x PI3DrhVRBHkeDp1hJ7/4/2Lb8DmMW6kkvk2tze4+F8JWC2tWSZGH4+NEzBNwkuAw oNB7oPuA1dG2/bvX3sQFTSfABc19xp5k3mCBoBqJw0l5ilWCLHw/pLsvWLHo8Fwm eg2fehlhF9ZKsbRPwDvjSINmop9PUzhaX/UrXvFJz1igc0FIOHG3CJaKRDK7ezAJ LjiKbzq0ZPHTfLuN+GOloJCwFUaYguJ9VhHT2GR/f2IjnuFUFQpfYFiA024uXpH/ aQFAAy1cWe8Kg2lnHcdCuUv357TCNZIkh4Q== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefgedrtddtgdeftddvkeegucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnegfrh hlucfvnfffucdljedtmdenucfjughrpeffhffvvefukfhfgggtuggjsehgtdorredttddv necuhfhrohhmpefoihgthhgrvghlucfrrghquhhivghruceomhhitghhrggvlhesphgrqh huihgvrhdrgiihiieqnecuggftrfgrthhtvghrnhepffduleehgefhkeefgfeftdfhkedu tddvhfdukeeggeehieekffelhfelkeehteefnecuvehluhhsthgvrhfuihiivgeptdenuc frrghrrghmpehmrghilhhfrhhomhepmhhitghhrggvlhesphgrqhhuihgvrhdrgiihiidp nhgspghrtghpthhtohepjedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepthhrih hsthgrnhdrhihimhesghhmrghilhdrtghomhdprhgtphhtthhopegurghvihgusehpghgs rggtkhhrvghsthdrohhrghdprhgtphhtthhopehpghhsqhhlqdhhrggtkhgvrhhssehpoh hsthhgrhgvshhqlhdrohhrghdprhgtphhtthhopehhlhhinhhnrghkrgesihhkihdrfhhi pdhrtghpthhtoheprhhosggvrhhtmhhhrggrshesghhmrghilhdrtghomhdprhgtphhtth hopegrnhgurhgvshesrghnrghrrgiivghlrdguvgdprhgtphhtthhopehmrghsrghordhf uhhjihhisehgmhgrihhlrdgtohhm X-ME-Proxy: Feedback-ID: i0fe9450f:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 17 Mar 2026 21:43:34 -0400 (EDT) Date: Wed, 18 Mar 2026 10:43:29 +0900 From: Michael Paquier To: Haibo Yan Cc: David Steele , Pg Hackers , Heikki Linnakangas , Robert Haas , Andres Freund , Fujii Masao Subject: Re: Return pg_control from pg_backup_stop(). Message-ID: References: <1248c005-b693-494a-8d7a-68bc7d482679@pgbackrest.org> <8b8aa673-fcef-4e14-a05d-0885283ef1b8@pgbackrest.org> <17DC1346-0CDE-4E39-B110-3D6FB0797AC6@gmail.com> <7F7B289B-F94F-42C2-9E54-6A689C0D64BB@gmail.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="ZW23AjjBRkQnF7f4" Content-Disposition: inline In-Reply-To: <7F7B289B-F94F-42C2-9E54-6A689C0D64BB@gmail.com> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --ZW23AjjBRkQnF7f4 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Mar 17, 2026 at 11:50:29AM -0700, Haibo Yan wrote: > Thank you for the clarification. I have now read the code, and > overall it looks good to me. I only had one very small comment. (Bottom-posting note from above, please be careful.) > I was just wondering whether the following might be slightly clearer: > ``` > memset(controlFile, 0, PG_CONTROL_FILE_SIZE); > memcpy(controlFile, ControlFile, sizeof(ControlFileData)); > ``` >=20 > I do not think this is a real issue, though. { ControlFile->backupStartPoint =3D checkPoint.redo; ControlFile->backupEndRequired =3D backupEndRequired; + ControlFile->backupLabelRequired =3D false; It sounds like it is going to be important to document the reason why the flag is reset here (aka we don't need the backup_label file anymore). +backup_control_file(uint8_t *controlFile) +{ + ControlFileData *controlData =3D ((ControlFileData *)controlFile); + + memset(controlFile + sizeof(ControlFileData), 0, + PG_CONTROL_FILE_SIZE - sizeof(ControlFileData)); + + LWLockAcquire(ControlFileLock, LW_SHARED); + memcpy(controlFile, ControlFile, sizeof(ControlFileData)); + LWLockRelease(ControlFileLock); + + controlData->backupLabelRequired =3D true; + + INIT_CRC32C(controlData->crc); + COMP_CRC32C(controlData->crc, controlFile, offsetof(ControlFileData, c= rc)); + FIN_CRC32C(controlData->crc); I was wondering if we should have an assertion at least to cross-check that the contents we store in shared memory never go out-of-sync with the on-disk contents, in the shape of a USE_ASSERT_CHECKING block that calls get_controlfile() and memcmp()'s the contents between shmem and the on-disk file, while the LWLock is taken. We ship the control file last on purpose, one reason being backups taken from standbys, so that may be sensible to do. Another property of the new control file flag that is implied in the implementation but not documented is that we should never check for backupLabelRequired when a backup_label is gone. Actually, the flag is reset in InitWalRecovery() in the initial steps of recovery, and the backup_label file is removed much later in StartupXLOG() just *after* UpdateControlFile() to minimize the window where the contents of the control file and the backup_label file is removed are out-of-sync. This window means that if we crash between the completion of UpdateControlFile() and the durable_rename() we could have a flag reset with a backup_label still around. On restart, recovery would fail, requiring a manual modification of the control file, at least. It sounds to me that this implementation detail should be documented clearly. Finally, here is a general opinion. I like this patch, and it is basically risk-free for base backups taken with the replication protocol as we update the control file with the new flag set on-the-fly.=20 Now, I am worried about backups that use pg_stop_backup(). Changing backup APIs has always been a very sensitive area, and this is going to require operators to update backup tools so as the control file received as a result of pg_stop_backup() is copied, at the cost of getting a failure if they don't do so. I will *not* proceed with this change without a clear approval from some more committers or senior hackers that they like this change (approach previously suggested by Andres, actually, for what I can see). I am adding in CC a few committers who have commented on this set of proposals and who have touched the recovery code in the last few years, for awareness. The timing is what it is, and we are at the end of a release cycle. Let's see if we can reach a consensus of some kind. -- Michael --ZW23AjjBRkQnF7f4 Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEG72nH6vTowiyblFKnvQgOdbyQH0FAmm6A0EACgkQnvQgOdby QH2bGA//ctPKCNU/9CWuz/GUhwYavjHaGhhMp9+TpdCvlhODUieTNy810sDWc40s UCmF8TxrIxSQPoPyQ2kE7hCHYH4hXO1IeMCqc/83MxU9VQnnGMzhmue+MFpsfBIp xOuMdSTwYl6mJ73oDe6Lo2IBbq77yPA9QwDmdTGKOkj6pU/kiI0pRlfaaaAiC0rV 5sFXgea0Vz4nbqOLKXBqs/uhaRo0i1qWWdtsxsl+hDTRPze6pFZffD9Ntwp0m4Fi TbezTInfbowoZebMJFkw5JxY3AMeQu219BR79EZNBJ6nrKw/EfvV2VlEB4QNPYxD XnUPvWaHBClfdF3mP8oyEdBqglLz6wwMdGfc7UpYDWJxtAH9p71MEQOOeCs4jw5I 8xdd5FUdty52sOeMrCW1pMyXgheYGlnroM7PkxoBHN3KLQ0VM/tV2gugOgwkUqGN S8gJVKB7Trx0fsX0LQ63iZl3bdzQRzRxQWYUNwnfRQtpY2iagIEUr/e5hY7XEF9y n0Hrjeq2uXsIikcqW/syvRBsvJS+aNXu5HOCzf3uL3ZAe+JAoFBY+ZVj+MuArbK/ P8kRKGlNnigd6r9p5H4z4DPkFX+LU+LyDATM5W8B3YqiEa1KaRyIimPh3QxHbsf+ UuVNU6OuBLMmD1oF8DWtovK6kCGw97iwbH3+95jr8cJBbDN+CEM= =rsfG -----END PGP SIGNATURE----- --ZW23AjjBRkQnF7f4--