public inbox for [email protected]  
help / color / mirror / Atom feed
Intermittent missing chunk ERRORS
12+ messages / 4 participants
[nested] [flat]

* Intermittent missing chunk ERRORS
@ 2025-04-26 17:09 Sbob <[email protected]>
  2025-04-26 20:04 ` Re: Intermittent missing chunk ERRORS Vijaykumar Jain <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  0 siblings, 2 replies; 12+ messages in thread

From: Sbob @ 2025-04-26 17:09 UTC (permalink / raw)
  To: [email protected]

All;


I am working with a client, they have an app that is running into 
error's like this:

_SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 
44915477 in pg_toast_5321896
_


I thought it was a corruption / REINDEX issue, however I have run 
REINDEXon the index, REINDEX on the table and even a reindexdb on the 
database. Today they passed me another error (the one above) and once I 
was able to login I ran the same query the app team referenced and it 
ran without any errors. I asked them to re-run the query and they 
informed me that it now works for them as well.


I assume that if this were a corrupt index issue then the query would 
continue to fail every time.

Has anyone seen this type of intermittent index error before? and 
thoughts on where to look /how to debug this?


Thanks in advance



^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-26 20:04 ` Vijaykumar Jain <[email protected]>
  2025-04-26 20:12   ` Re: Intermittent missing chunk ERRORS Vijaykumar Jain <[email protected]>
  1 sibling, 1 reply; 12+ messages in thread

From: Vijaykumar Jain @ 2025-04-26 20:04 UTC (permalink / raw)
  To: Sbob <[email protected]>; +Cc: Pgsql-admin <[email protected]>

a lot in this gist is something you can debug to start with.

https://gist.github.com/supix/80f9a6111dc954cf38ee99b9dedf187a

if not the index than the toast table might be corrupt.
the bigger problem might be to figure out how it got corrupt. a one off
power incident, or bug or storage is having issues etc.
once that is clear,
a simple start would be to run a pg_dump of the database to /dev/null and
check if the dump is clean without errors.

Thanks,
Vijay

Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada;

On Sat, Apr 26, 2025, 10:39 PM Sbob <[email protected]> wrote:

> All;
>
>
> I am working with a client, they have an app that is running into error's
> like this:
>
>
> *SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477
> in pg_toast_5321896 *
>
>
> I thought it was a corruption / REINDEX issue, however I have run
> REINDEXon the index, REINDEX on the table and even a reindexdb on the
> database. Today they passed me another error (the one above) and once I was
> able to login I ran the same query the app team referenced and it ran
> without any errors. I asked them to re-run the query and they informed me
> that it now works for them as well.
>
>
> I assume that if this were a corrupt index issue then the query would
> continue to fail every time.
>
> Has anyone seen this type of intermittent index error before? and thoughts
> on where to look /how to debug this?
>
>
> Thanks in advance
>
>
>


^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:04 ` Re: Intermittent missing chunk ERRORS Vijaykumar Jain <[email protected]>
@ 2025-04-26 20:12   ` Vijaykumar Jain <[email protected]>
  0 siblings, 0 replies; 12+ messages in thread

From: Vijaykumar Jain @ 2025-04-26 20:12 UTC (permalink / raw)
  To: Sbob <[email protected]>; +Cc: Pgsql-admin <[email protected]>

forgot about the core tool to check for the problem.

https://www.postgresql.org/docs/current/amcheck.html

this can help find corruption issues much faster.


Thanks,
Vijay

Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada;

On Sun, Apr 27, 2025, 1:34 AM Vijaykumar Jain <
[email protected]> wrote:

> a lot in this gist is something you can debug to start with.
>
> https://gist.github.com/supix/80f9a6111dc954cf38ee99b9dedf187a
>
> if not the index than the toast table might be corrupt.
> the bigger problem might be to figure out how it got corrupt. a one off
> power incident, or bug or storage is having issues etc.
> once that is clear,
> a simple start would be to run a pg_dump of the database to /dev/null and
> check if the dump is clean without errors.
>
> Thanks,
> Vijay
>
> Open to work
> Resume - Vijaykumar Jain <https://github.com/cabecada;
>
> On Sat, Apr 26, 2025, 10:39 PM Sbob <[email protected]> wrote:
>
>> All;
>>
>>
>> I am working with a client, they have an app that is running into error's
>> like this:
>>
>>
>> *SQL Error [XX001]: ERROR: missing chunk number 0 for toast value
>> 44915477 in pg_toast_5321896 *
>>
>>
>> I thought it was a corruption / REINDEX issue, however I have run
>> REINDEXon the index, REINDEX on the table and even a reindexdb on the
>> database. Today they passed me another error (the one above) and once I was
>> able to login I ran the same query the app team referenced and it ran
>> without any errors. I asked them to re-run the query and they informed me
>> that it now works for them as well.
>>
>>
>> I assume that if this were a corrupt index issue then the query would
>> continue to fail every time.
>>
>> Has anyone seen this type of intermittent index error before? and
>> thoughts on where to look /how to debug this?
>>
>>
>> Thanks in advance
>>
>>
>>


^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-26 20:24 ` Laurenz Albe <[email protected]>
  2025-04-26 20:28   ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  2025-04-26 22:54   ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  1 sibling, 2 replies; 12+ messages in thread

From: Laurenz Albe @ 2025-04-26 20:24 UTC (permalink / raw)
  To: Sbob <[email protected]>; [email protected]

On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
> I am working with a client, they have an app that is running into error's like this:
> 
> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896 
> 
> I thought it was a corruption / REINDEX issue, however I have run REINDEXon the index,
> REINDEX on the table and even a reindexdb on the database. Today they passed me another
> error (the one above) and once I was able to login I ran the same query the app team
> referenced and it ran without any errors. I asked them to re-run the query and they
> informed me that it now works for them as well.
> 
> I assume that if this were a corrupt index issue then the query would continue to fail
> every time.
> 
> Has anyone seen this type of intermittent index error before? and thoughts on where to
> look /how to debug this?

The error is data corruption.

If it is transient, that could mean a couple of things:

- a TOAST index was corrupted, and REINDEX fixed it

- faulty RAM corrupted data, but as soon as the corrupted buffer was evicted,
  the problem was gone

- somebody updated or deleted the row, and the corruption is hidden

Yours,
Laurenz Albe





^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
@ 2025-04-26 20:28   ` Tom Lane <[email protected]>
  2025-04-26 22:49     ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  1 sibling, 1 reply; 12+ messages in thread

From: Tom Lane @ 2025-04-26 20:28 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Sbob <[email protected]>; [email protected]

Laurenz Albe <[email protected]> writes:
> On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
>> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896 
>> Has anyone seen this type of intermittent index error before? and thoughts on where to
>> look /how to debug this?

> The error is data corruption.

We've seen transient errors of this sort arise from what are basically
timing problems, and fixed at least a few cases.  I wonder exactly
which PG version the problematic installation is running.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  2025-04-26 20:28   ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
@ 2025-04-26 22:49     ` Sbob <[email protected]>
  2025-04-27 00:03       ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: Sbob @ 2025-04-26 22:49 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; Laurenz Albe <[email protected]>; +Cc: [email protected]


On 4/26/25 2:28 PM, Tom Lane wrote:
> Laurenz Albe <[email protected]> writes:
>> On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
>>> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896
>>> Has anyone seen this type of intermittent index error before? and thoughts on where to
>>> look /how to debug this?
>> The error is data corruption.
> We've seen transient errors of this sort arise from what are basically
> timing problems, and fixed at least a few cases.  I wonder exactly
> which PG version the problematic installation is running.
>
> 			regards, tom lane


We are on V14







^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  2025-04-26 20:28   ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  2025-04-26 22:49     ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-27 00:03       ` Tom Lane <[email protected]>
  2025-04-27 16:45         ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: Tom Lane @ 2025-04-27 00:03 UTC (permalink / raw)
  To: Sbob <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]

Sbob <[email protected]> writes:
> On 4/26/25 2:28 PM, Tom Lane wrote:
>> We've seen transient errors of this sort arise from what are basically
>> timing problems, and fixed at least a few cases.  I wonder exactly
>> which PG version the problematic installation is running.

> We are on V14

14-what?  We are talking about bug fixes, so it matters.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  2025-04-26 20:28   ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  2025-04-26 22:49     ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-27 00:03       ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
@ 2025-04-27 16:45         ` Sbob <[email protected]>
  2025-04-27 18:25           ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: Sbob @ 2025-04-27 16:45 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]


On 4/26/25 6:03 PM, Tom Lane wrote:
> Sbob <[email protected]> writes:
>> On 4/26/25 2:28 PM, Tom Lane wrote:
>>> We've seen transient errors of this sort arise from what are basically
>>> timing problems, and fixed at least a few cases.  I wonder exactly
>>> which PG version the problematic installation is running.
>> We are on V14
> 14-what?  We are talking about bug fixes, so it matters.
>
> 			regards, tom lane


postgres=# select version();
version
----------------------------------------------------------------------------------------------------------
  PostgreSQL 14.17 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 
20210514 (Red Hat 8.5.0-23), 64-bit
(1 row)







^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  2025-04-26 20:28   ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  2025-04-26 22:49     ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-27 00:03       ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  2025-04-27 16:45         ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-27 18:25           ` Tom Lane <[email protected]>
  2025-04-28 14:10             ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: Tom Lane @ 2025-04-27 18:25 UTC (permalink / raw)
  To: Sbob <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]

Sbob <[email protected]> writes:
> On 4/26/25 6:03 PM, Tom Lane wrote:
>> 14-what?  We are talking about bug fixes, so it matters.

>   PostgreSQL 14.17 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 
> 20210514 (Red Hat 8.5.0-23), 64-bit

OK, well that's current at least...

I don't suppose you can build a test case that reproduces this
failure?  If it's timing-dependent as I suspect, it might only fail
once in awhile, but that would be good enough for investigation.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  2025-04-26 20:28   ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  2025-04-26 22:49     ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-27 00:03       ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
  2025-04-27 16:45         ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-27 18:25           ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
@ 2025-04-28 14:10             ` Sbob <[email protected]>
  0 siblings, 0 replies; 12+ messages in thread

From: Sbob @ 2025-04-28 14:10 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]


On 4/27/25 12:25 PM, Tom Lane wrote:
> Sbob <[email protected]> writes:
>> On 4/26/25 6:03 PM, Tom Lane wrote:
>>> 14-what?  We are talking about bug fixes, so it matters.
>>    PostgreSQL 14.17 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
>> 20210514 (Red Hat 8.5.0-23), 64-bit
> OK, well that's current at least...
>
> I don't suppose you can build a test case that reproduces this
> failure?  If it's timing-dependent as I suspect, it might only fail
> once in awhile, but that would be good enough for investigation.
>
> 			regards, tom lane


Yeah, it only fails once in awhile, it seems like it fairly often but 
building a case to recreate it will likely be difficult. I will continue 
to work with the app team and see if I can find a pattern / process that 
recreates it.








^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
@ 2025-04-26 22:54   ` Sbob <[email protected]>
  2025-04-27 19:52     ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  1 sibling, 1 reply; 12+ messages in thread

From: Sbob @ 2025-04-26 22:54 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; [email protected]


On 4/26/25 2:24 PM, Laurenz Albe wrote:
> On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
>> I am working with a client, they have an app that is running into error's like this:
>>
>> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896
>>
>> I thought it was a corruption / REINDEX issue, however I have run REINDEXon the index,
>> REINDEX on the table and even a reindexdb on the database. Today they passed me another
>> error (the one above) and once I was able to login I ran the same query the app team
>> referenced and it ran without any errors. I asked them to re-run the query and they
>> informed me that it now works for them as well.
>>
>> I assume that if this were a corrupt index issue then the query would continue to fail
>> every time.
>>
>> Has anyone seen this type of intermittent index error before? and thoughts on where to
>> look /how to debug this?
> The error is data corruption.
>
> If it is transient, that could mean a couple of things:
>
> - a TOAST index was corrupted, and REINDEX fixed it
>
> - faulty RAM corrupted data, but as soon as the corrupted buffer was evicted,
>    the problem was gone
>
> - somebody updated or deleted the row, and the corruption is hidden
>
> Yours,
> Laurenz Albe


I did REINDEX and the error came back, however this last time when they 
informed me about the error they sent the select statement throwing the 
error and once I logged in I ran their select statement and it worked, 
no REINDEX had been run, and then I asked them to run it again and it 
worked for them as well

So, the first option above is not the answer and I dont think anyone 
updated the row, their process is mostly reads but I will verify, so 
maybe the RAM issue? the servers are cloud VM's in a private cloud








^ permalink  raw  reply  [nested|flat] 12+ messages in thread

* Re: Intermittent missing chunk ERRORS
  2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
  2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
  2025-04-26 22:54   ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-27 19:52     ` Laurenz Albe <[email protected]>
  0 siblings, 0 replies; 12+ messages in thread

From: Laurenz Albe @ 2025-04-27 19:52 UTC (permalink / raw)
  To: Sbob <[email protected]>; [email protected]

On Sat, 2025-04-26 at 16:54 -0600, Sbob wrote:
> > If it is transient, that could mean a couple of things:
> > 
> > - a TOAST index was corrupted, and REINDEX fixed it
> > 
> > - faulty RAM corrupted data, but as soon as the corrupted buffer was evicted,
> >     the problem was gone
> > 
> > - somebody updated or deleted the row, and the corruption is hidden
> 
> I did REINDEX and the error came back, however this last time when they 
> informed me about the error they sent the select statement throwing the 
> error and once I logged in I ran their select statement and it worked, 
> no REINDEX had been run, and then I asked them to run it again and it 
> worked for them as well
> 
> So, the first option above is not the answer and I dont think anyone 
> updated the row, their process is mostly reads but I will verify, so 
> maybe the RAM issue? the servers are cloud VM's in a private cloud

Maybe.  Check the hardware.
Or maybe it is as Tom says.

Yours,
Laurenz Albe





^ permalink  raw  reply  [nested|flat] 12+ messages in thread


end of thread, other threads:[~2025-04-28 14:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:04 ` Vijaykumar Jain <[email protected]>
2025-04-26 20:12   ` Vijaykumar Jain <[email protected]>
2025-04-26 20:24 ` Laurenz Albe <[email protected]>
2025-04-26 20:28   ` Tom Lane <[email protected]>
2025-04-26 22:49     ` Sbob <[email protected]>
2025-04-27 00:03       ` Tom Lane <[email protected]>
2025-04-27 16:45         ` Sbob <[email protected]>
2025-04-27 18:25           ` Tom Lane <[email protected]>
2025-04-28 14:10             ` Sbob <[email protected]>
2025-04-26 22:54   ` Sbob <[email protected]>
2025-04-27 19:52     ` Laurenz Albe <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox