Re: Change checkpoint‑record‑missing PANIC to FATAL


From: Michael Paquier <[email protected]>
To: Nitin Jadhav <[email protected]>
Cc: Pg Hackers <[email protected]>
Subject: Re: Change checkpoint‑record‑missing PANIC to FATAL
Date: Tue, 10 Mar 2026 12:58:36 +0900
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAMm1aWZbQ-Acp_xAxC7mX9uZZMH8+NpfepY9w=AOxbBVT9E=uA@mail.gmail.com>
References: <CAMm1aWZ9Tv=Wrx52_2Ppw+6ULf_twRZuQm=ZWLA_a-kXWykHkQ@mail.gmail.com>
	<[email protected]>
	<CAMm1aWb47v9Bx40P1_6YpRxxKi9XSwjAV_bLbFxx66Rg8o3+=g@mail.gmail.com>
	<[email protected]>
	<CAMm1aWZQfMm1Wn=cMorqNsqdySERZCgV2JL87jVy4yCmLY8xvg@mail.gmail.com>
	<[email protected]>
	<CAMm1aWZbQ-Acp_xAxC7mX9uZZMH8+NpfepY9w=AOxbBVT9E=uA@mail.gmail.com>

On Fri, Mar 06, 2026 at 11:02:41AM +0530, Nitin Jadhav wrote:
> Patch 0001 adjusts the error severity during crash recovery when the
> checkpoint record referenced by pg_control cannot be located and no
> backup_label file is present. The error is lowered from PANIC to
> FATAL. This patch also adds a new TAP test that verifies startup fails
> with a clear FATAL error. The test is straightforward: it removes the
> WAL segment containing the checkpoint record and confirms that the
> server reports the expected error.

There is one way 0001 could be unstable, actually.  The test gets the
segment name from a live server, so if the run is very slow we may end
up with an incorrect record.  It would be possible to use pg_controldata
once the server has been shut down, but we should be on the same segment
anyway because nothing happens on the server, so I have left the test as
you have suggested, and applied this one.  You have missed an update in
meson.build, and I am not sure that there was a need for renaming 050,
either.
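
To illustrate the pg_controldata alternative mentioned above, here is a
minimal sketch in Python (the in-tree TAP tests are written in Perl, and
this is not the committed test).  It reads the checkpoint location and
timeline from pg_controldata-style output and maps them to a WAL segment
file name.  The function names and the sample output are hypothetical,
and the 16MB segment size is an assumption (the initdb default).

```python
import re

# Assumption: default WAL segment size; it can be changed at initdb time.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '0/3000028' into a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_segment_name(timeline: int, lsn: str,
                     seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the name of the WAL segment file that contains the LSN."""
    segno = parse_lsn(lsn) // seg_size
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (timeline,
                             segno // segs_per_xlogid,
                             segno % segs_per_xlogid)


def checkpoint_segment(controldata_output: str) -> str:
    """Extract checkpoint location and timeline, return the segment name."""
    lsn = re.search(r"^Latest checkpoint location:\s+([0-9A-Fa-f]+/[0-9A-Fa-f]+)",
                    controldata_output, re.MULTILINE)
    tli = re.search(r"^Latest checkpoint's TimeLineID:\s+(\d+)",
                    controldata_output, re.MULTILINE)
    if lsn is None or tli is None:
        raise ValueError("checkpoint location or timeline not found")
    return wal_segment_name(int(tli.group(1)), lsn.group(1))


# Illustrative, abbreviated output; the values are made up.
SAMPLE_OUTPUT = """\
Database cluster state:               shut down
Latest checkpoint location:           0/3000028
Latest checkpoint's TimeLineID:       1
"""

if __name__ == "__main__":
    print(checkpoint_segment(SAMPLE_OUTPUT))
```

Reading the location after a clean shutdown avoids the window in which
a live server could move on to a later checkpoint.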

> Missing checkpoint WAL segment referenced by backup_label:
> This test uses an online backup to create a backup_label file,
> extracts the checkpoint record information from it, removes the
> corresponding WAL segment, and verifies that the server reports the
> expected error.
> 
> Missing redo WAL segment referenced by the checkpoint:
> In this test, redo and checkpoint records are forced into different
> WAL segments using injection points. A cold backup is then taken, with
> an explicit backup_label created in the restored cluster. The WAL
> segment containing the redo record is removed, and startup is expected
> to fail with the appropriate error message.
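
For readers following along, the extraction step described in the first
of these tests could look roughly like the sketch below.  It is written
in Python for illustration only (the actual tests are Perl TAP scripts),
the function names are hypothetical, and the sample backup_label
contents are made up and abbreviated.

```python
import re

LSN = r"[0-9A-Fa-f]+/[0-9A-Fa-f]+"


def backup_label_checkpoint(label_text: str) -> str:
    """Return the CHECKPOINT LOCATION LSN from backup_label contents."""
    m = re.search(r"^CHECKPOINT LOCATION:\s*(" + LSN + r")\s*$",
                  label_text, re.MULTILINE)
    if m is None:
        raise ValueError("CHECKPOINT LOCATION not found in backup_label")
    return m.group(1)


def backup_label_start(label_text: str) -> tuple:
    """Return (redo LSN, WAL file name) from the START WAL LOCATION line."""
    m = re.search(r"^START WAL LOCATION:\s*(" + LSN + r")\s*"
                  r"\(file ([0-9A-F]{24})\)\s*$",
                  label_text, re.MULTILINE)
    if m is None:
        raise ValueError("START WAL LOCATION not found in backup_label")
    return m.group(1), m.group(2)


# Illustrative, abbreviated backup_label; the values are made up.
SAMPLE_LABEL = """\
START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000060
BACKUP METHOD: streamed
"""

if __name__ == "__main__":
    print(backup_label_checkpoint(SAMPLE_LABEL))
    print(backup_label_start(SAMPLE_LABEL))
```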

These two are very close to the existing tests that we have on HEAD
now.  Could it be better to group them together instead?  In particular,
054 reuses the injection point trick in what is clearly a copy-paste
taken from 050.  Avoiding this duplication would be nice.
--
Michael


