Date: Thu, 5 Feb 2026 09:40:58 +0900
From: Michael Paquier
To: Nitin Jadhav
Cc: Pg Hackers
Subject: Re: Change checkpoint‑record‑missing PANIC to FATAL

On Mon, Dec 29, 2025 at 08:39:08PM +0530, Nitin Jadhav wrote:
> Apologies for the delay.
> At a high level, the recovery startup cases we want to test fall into
> two main buckets:
> (1) with a backup_label file and (2) without a backup_label file.

For clarity's sake, we are talking about lowering this one in
xlogrecovery.c, which relates to the code path where there is no
backup_label file:
        ereport(PANIC,
                errmsg("could not locate a valid checkpoint record at %X/%08X",
                       LSN_FORMAT_ARGS(CheckPointLoc)));

> From these two situations, we can cover the following scenarios:
> 1) Primary crash recovery without a backup_label – Delete the WAL
> segment containing the checkpoint record and try starting the server.

Yeah, let's add a test for that.  It would be enough to remove the
segment that includes the checkpoint record.  There should be no need
to be fancy with injection points like the other test case from
15f68cebdcec.

> 2) Primary crash recovery with a backup_label – Take a base backup
> (which creates the backup_label), remove the checkpoint WAL segment,
> and start the server with that backup directory.

Okay.  I don't mind having something here, for the two FATAL cases in
the code path where the backup_label exists:
- REDO record missing with checkpoint record found.  This is similar
to 15f68cebdcec.
- Checkpoint record missing.

Both should be cross-checked with the FATAL errors generated in the
server logs.

> 3) Standby crash recovery – Stop the standby, delete the checkpoint
> WAL segment, and start it again to see how standby recovery behaves.

In this case, we need to have a restore_command set anyway, don't we,
meaning that we should never fail?  I don't recall that we currently
have a test for that, where we could look at the server logs to check
that a segment has been retrieved because the segment that includes
the checkpoint record is missing.

> 4) PITR / archive‑recovery – Remove the checkpoint WAL segment and
> start the server with a valid restore_command so it enters archive
> recovery.

Same as 3) to me: standby mode cannot be activated without a
restore_command, and the recovery GUC checks are done in accordance
with the signal files before we attempt to read the initial checkpoint
record.

> Tests (2) and (4) are fairly similar, so we can merge them if they
> turn out to be redundant.
> These are the scenarios I have in mind so far. Please let me know if
> you think anything else should be added.

For the sake of the change from PANIC to FATAL mentioned at the top of
this message, (1) would be enough.  The two cases of (2) I'm
mentioning would be nice bonuses.

I would recommend double-checking first whether we already trigger
these errors in some of the existing tests.  Perhaps we don't need to
add anything except a check in some node's logs for the wanted error
string patterns.

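As a rough illustration of scenario (1), and not something taken from the patch or from the existing tests, here is a minimal Python sketch. It assumes the default 16MB WAL segment size and an LSN printed in the usual hi/lo hexadecimal form (for instance the "Latest checkpoint location" value reported by pg_controldata). It maps a checkpoint LSN to the WAL segment file name that a test would have to remove, and scans server log text for the error string quoted earlier at either severity. All function names are hypothetical.

```python
import re

# Assumption: default WAL segment size (16MB). A cluster initialized with a
# different segment size needs that value passed in explicitly.
DEFAULT_WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn_text):
    """Convert an LSN written as 'hi/lo' (hexadecimal) into an integer."""
    hi, lo = lsn_text.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_segment_name(lsn_text, timeline=1,
                     segment_size=DEFAULT_WAL_SEGMENT_SIZE):
    """Return the 24-character WAL file name holding the given LSN.

    The name is three 8-digit hex fields: timeline, then the segment
    number split into its high and low parts.
    """
    segments_per_xlogid = 0x100000000 // segment_size
    segno = parse_lsn(lsn_text) // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


# Matches the message from the ereport() call discussed above, at either
# the current (PANIC) or the proposed (FATAL) severity.
CHECKPOINT_ERROR = re.compile(
    r"\b(PANIC|FATAL):\s+could not locate a valid checkpoint record at "
    r"([0-9A-Fa-f]+/[0-9A-Fa-f]+)"
)


def find_checkpoint_error(log_text):
    """Return (severity, lsn) for the first match in log_text, else None."""
    match = CHECKPOINT_ERROR.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

With these assumptions, wal_segment_name("0/3000028") gives 000000010000000000000003, and a test would expect find_checkpoint_error() to report FATAL rather than PANIC once the severity is lowered.
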
--
Michael