Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1swDjB-002VAY-PA for pgsql-hackers@arkaria.postgresql.org; Thu, 03 Oct 2024 04:46:14 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1swDjA-008uEd-KG for pgsql-hackers@arkaria.postgresql.org; Thu, 03 Oct 2024 04:46:12 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1swDj9-008uAi-6I for pgsql-hackers@lists.postgresql.org; Thu, 03 Oct 2024 04:46:12 +0000 Received: from fhigh-a2-smtp.messagingengine.com ([103.168.172.153]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1swDj2-002IMZ-3S for pgsql-hackers@postgresql.org; Thu, 03 Oct 2024 04:46:10 +0000 Received: from phl-compute-07.internal (phl-compute-07.phl.internal [10.202.2.47]) by mailfhigh.phl.internal (Postfix) with ESMTP id 73B2C11402B2; Thu, 3 Oct 2024 00:46:03 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-07.internal (MEProxy); Thu, 03 Oct 2024 00:46:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=paquier.xyz; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1727930763; x=1728017163; bh=p244jDVNRx ouwWwgSvw6X2zaRBSZqgtTP7MnnPql+rI=; b=zKWzF/zo5c7PTLOumGckJbqzCw ictcAYvuMlzvWkDkyHoGVERUfeEDQNY8a3Ug4Ku/7hwCaT/xUpXaHNpAv6r+yWVj W6/E82GiOSRdm5NhS6ZW6k+SBb/mqgKL5XDNFlvxLRuky/D2KaawZA0Ra/TdS0Nu kIo6rUxDEhK2Ov7Mlk0rLyWs5EnNBxve7T0XzBUHHiwUL4pfD4woORjFJom4Rs53 cJAngPIpVTp7AXZiVi9xJoODVKMxUfdjOY2mumLSpSEwsiJgxqaYVc/zfDeFo5eO oftM8D6CmHAYtUlDty89rewepYnNPMuJQ65d08X6CLg+ZuN2Yi2X7UMesYxA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1727930763; x=1728017163; bh=p244jDVNRxouwWwgSvw6X2zaRBSZ qgtTP7MnnPql+rI=; b=IMz2vwJslBP9RyCmk5thyt7jxDiPQpuk9ZRAqu3VO+ku 3yTmTckGGq+SFnmGtcipAiMlARU5x4NOwjWxGY6HeQbqWZBheIRo0Amzory2BJhB IDb+QyT+vA/msaFgBXvQr+TUpV1S44Jy8jVADE9vh9G7oJBcTGQ6hCfRjm/7a83l Kx5s+TAeaF+UGKSE2HaYoM+rk/4vUhaGhecSnavZutbzh1yaAawaZ6Y6k2SohpzH feRINX29dmiRe8K9GmejyjhXggvLweArjYehQvxQ32151pQBd/Ft7dpmu1xuiZTx LL7MNRW3K3z7EktVfh3nS3j5VpxjRLpvnxPOJbIHrQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvddvtddgkeelucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh htshculddquddttddmnegfrhhlucfvnfffucdljedtmdenucfjughrpeffhffvvefukfhf gggtuggjsehgtderredttddvnecuhfhrohhmpefoihgthhgrvghlucfrrghquhhivghruc eomhhitghhrggvlhesphgrqhhuihgvrhdrgiihiieqnecuggftrfgrthhtvghrnhepteel ieefudffhffhtdetleeggeegfffhkeeuveetiefgudduvedutefggeeivdejnecuvehluh hsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepmhhitghhrggvlhes phgrqhhuihgvrhdrgiihiidpnhgspghrtghpthhtohepvddpmhhouggvpehsmhhtphhouh htpdhrtghpthhtohepuggrvhhiugesphhgsggrtghkrhgvshhtrdhorhhgpdhrtghpthht ohepphhgshhqlhdqhhgrtghkvghrshesphhoshhtghhrvghsqhhlrdhorhhg X-ME-Proxy: Feedback-ID: i0fe9450f:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 3 Oct 2024 00:46:01 -0400 (EDT) Date: Thu, 3 Oct 2024 13:45:43 +0900 From: Michael Paquier To: David Steele Cc: Pg Hackers Subject: Re: Return pg_control from pg_backup_stop(). Message-ID: References: <83f5295e-082d-432c-92a4-365707bd2296@pgbackrest.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="aTlC9NYQIPYa+4ax" Content-Disposition: inline In-Reply-To: <83f5295e-082d-432c-92a4-365707bd2296@pgbackrest.org> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --aTlC9NYQIPYa+4ax Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Oct 02, 2024 at 09:03:27AM +0000, David Steele wrote: > On 10/2/24 10:11, Michael Paquier wrote: >> Hmm, okay. There is also a slight impact for BASE_BACKUP, requiring >> basebackup_progress_wait_wal_archive() and do_pg_backup_stop() to be >> called earlier when sending the main data directory which is the last >> one in the list of tablespaces. As far as I can see, this does not >> change the logic because do_pg_backup_stop() does not touch the >> control file, but it seems to me that we'd rather keep these two >> calls as they are now, and send the control file once we are out of >> the for loop that processes all the tablespaces. That seems slightly >> cleaner to me, still I agree that both are the same things. >=20 > Sending pg_control later results in even more code churn because of how t= ar > files are terminated. I've updated it that way in v2 so you can see what I > mean. I don't have a strong preference, though, so if you prefer the > implementation in v2 then that's fine with me. It does not make much of a difference, indeed. >> Anyway, finishing do_pg_backup_stop() and then sending the control >> file is just a consequence of the implementation choice driven by the >> output required for the SQL function so as this is stored in the >> backup state to get it down to pg_backup_stop() in xlogfuncs.c, so we >> could also take one step back and forget about the SQL function, >> setting only the new flag when sending a BASE_BACKUP, or just not use >> the backupState to store this data as the exercise involves forcing >> one boolean and recalculate a CRC32. >=20 > I can definitely see us making other updates to pg_control so I would rat= her > keep this logic centralized, even though it is not too complicated at this > point. Still, even 8 lines of code (as it is now) seems better not to > duplicate. I was wondering if the field update should be hidden behind a macro that uses an offsetof() on ControlFileData, with the name of the field and a pointer to the value to update to. If you include the CRC32 calculation in that, that makes for less chunks of code when updating one field of the control file. The full CRC calculation could also be hidden inside a macro, as there are a couple of places where we do the same things, like pg_rewind.c, etc. >> We're talking about a 8kB file which has a size of 512B >> (PG_CONTROL_MAX_SAFE_SIZE) to avoid such issues. So I'm not sure to >> see your point here? >=20 > Even at 512B it is possible to see tears in pg_control and they happen in > the build farm right now. In fact, this thread [1] trying to fix the prob= lem > was what got me thinking about alternate solutions to preventing tears in > pg_control. Thomas' proposed fixes have not been committed to my knowledge > so the problem remains, but would be fixed by this commit. Ah, right. That rings a bell. Thomas has done some work with c558e6fd92ff and 63a582222c6b. And we're still not taking the ControlFileLock while copying it over.. It looks like we should do it separately, and backpatch. That's not something for this thread to worry about. >> Perhaps existing >> backup solutions are good enough risk vs reward is not worth it? >=20 > I'm not sure I see the risk here. Saving out pg_control is optional so no > changes to current software is required. Of course they miss the benefit = of > the protection against tears and missing backup_label, but that is a choi= ce. > > Again, optional, but if I was able to manage these saves using the psql > interface in the TAP tests then I'd say it would be pretty easy for anyone > with a normal connection to Postgres. Also, we require users to treat > tabelspace_map and backup_label as binary so not too big a change here. Maintenance cost for a limited user impact overall. With incremental backups being a thing in v18 only available through the replication protocol, the SQL functions have less advantages these days. My point would be to see this thread as a two-step process: 1) Update the flag in the control file when sending it across in replication stream. 2) Do the SQL function thing with the bytea for the control file, if necessary. 1) is something that has more value than 2), IMO, because there is no need for a manual step when a backup is taken by the replication protocol. Well, custom backup solutions that rely on the replication protocol to copy the data would need to make sure that they have a backup_label, but that's something they should do anyway and what this patch wants to protect users from. The SQL part is optional IMO. It can be done, but it has less impact overall and makes backups more complicated by requiring the manual copy of the control file.=20 -- Michael --aTlC9NYQIPYa+4ax Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEG72nH6vTowiyblFKnvQgOdbyQH0FAmb+IXcACgkQnvQgOdby QH3KyA/7B3PNgIgksJI6AV9tJsqXeBfvG4LtRWblGgINp+We+f2IgFTXg6eYz1kE sx+vkEEcO++hnAoLk20gdMnmmBL3H01tCT7WwNRspfTUvok97813qj0ZMQ3rmhnP VnGTaX5uynAG7Fs9JVxa1j/n0gSH5K0xt2kIH2062eWZlmw/KhWY6nmAgxgm55zU jNv/X7vx7u2zVnAXygfYUKzSIgByaNgE4Kq5TXSrYKYYgdjWBy1G11MawneTKxWt KkqewgPeYJwDvidWL9dmgp5F+F8acbpHnLZ5zDX19d93LCwleDtEoDZWUAmnOr04 ps+G6WvQRAoVajB30BB0HcUb8NgOtGO+uX7ULdhpfnlim34UKRvmAb73GOSqUvyv ZAqOEvaEI0dwyFyYlunethR6TqbjoH3HY8Y7eAsaUFr78PDnOq0qu2+BUHRtL9PX 3U0UI0hKhS10mR1ZLg7Roi4vRAOPPnVy7Afx3VnP7ZCDpGB2giO8gYhdsdhipWY/ Mn40reHc/W6qzQ8Hg+Y5MowFnBonO5tHkOruj3VsTDsZpsSLQOd9lOOdI40pTNob I0+o2PNsC9dqRLFYeynlyuhTLDftsZlvxYJbmOtV+k6J73t8r2MotPIJ7i78nsx4 X3wWwUiQ6VnA6wAUdBv51eN54J+tFVK7/Imj18V7g7PgU7ngw3o= =Jurf -----END PGP SIGNATURE----- --aTlC9NYQIPYa+4ax--