public inbox for [email protected]  
help / color / mirror / Atom feed
From: Laurenz Albe <[email protected]>
To: Pawel Kudzia <[email protected]>
To: [email protected]
Subject: Re: Silent data corruption in PostgreSQL 17 - how to detect it proactively?
Date: Sun, 14 Sep 2025 15:40:08 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAJYBUS9pUjZbqsndbjdHUNugonfQ4DRr_HEn0QkyTXsds0VLMQ@mail.gmail.com>
References: <CAJYBUS9T=JCVGkb9yXn5GVYC_g-e5mRi4MFoNHKEZV0keuHHwQ@mail.gmail.com>
	<[email protected]>
	<CAJYBUS9pUjZbqsndbjdHUNugonfQ4DRr_HEn0QkyTXsds0VLMQ@mail.gmail.com>

On Sun, 2025-09-14 at 14:47 +0200, Pawel Kudzia wrote:
> On Sun, Sep 14, 2025 at 12:35 PM Laurenz Albe <[email protected]> wrote:
> > On Sun, 2025-09-14 at 10:30 +0200, Pawel Kudzia wrote:
> > > I've hit a silent data corruption for TOAST data - leading to some
> > > infinite loop when accessing bytea column for very particular row. I
> > > did not suffer data loss - data from streaming replica was fine, I've
> > > used it to rebuild the main server.
> > > 
> > > I'm wondering if there's any proactive way of detecting that type of
> > > an issue rather than discovering pile-up of SELECT queries leading to
> > > CPU starvation or finding hanged backup jobs.
> > > 
> > > PostgreSQL's log does not have any error messages indicating data corruption.
> > > 
> > > This server does not have checksums enabled, but - as I understand -
> > > such checksums cannot be checked online anyway. So - how can I detect
> > > similar corruption?
> > 
> > A proactive way of detecting TOAST corruption... how about
> > 
> >   pg_dump -f /dev/null yourdatabase
> > 
> > If there is TOAST corruption, that should give you an error.
> 
> Thanks for your answer! pg_dump mydatabase ends with ~the same as the
> SELECT statement.
> It hangs leaving 'postgres: 17/main: postgres application_cache
> [local] COPY' using 100% of single CPU core and no messages in PG's
> logs.
> 
> I was curious if there's any other tool that can spot that type of
> corruption and state it explicitly rather than guess it by monitoring
> execution time.

Data checksums would certainly be a step in the right direction.
They are checked whenever the data are read from storage.

There is no tool that will reliably detect all kinds of data corruption.

Another tool ar your disposal are the functions from the amcheck
contrib module (or the pg_amcheck executable that calls them).

Yours,
Laurenz Albe






view thread (4+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Silent data corruption in PostgreSQL 17 - how to detect it proactively?
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox