public inbox for [email protected]
help / color / mirror / Atom feedIntermittent missing chunk ERRORS
12+ messages / 4 participants
[nested] [flat]
* Intermittent missing chunk ERRORS
@ 2025-04-26 17:09 Sbob <[email protected]>
2025-04-26 20:04 ` Re: Intermittent missing chunk ERRORS Vijaykumar Jain <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
0 siblings, 2 replies; 12+ messages in thread
From: Sbob @ 2025-04-26 17:09 UTC (permalink / raw)
To: [email protected]
All;
I am working with a client, they have an app that is running into
error's like this:
_SQL Error [XX001]: ERROR: missing chunk number 0 for toast value
44915477 in pg_toast_5321896
_
I thought it was a corruption / REINDEX issue, however I have run
REINDEXon the index, REINDEX on the table and even a reindexdb on the
database. Today they passed me another error (the one above) and once I
was able to login I ran the same query the app team referenced and it
ran without any errors. I asked them to re-run the query and they
informed me that it now works for them as well.
I assume that if this were a corrupt index issue then the query would
continue to fail every time.
Has anyone seen this type of intermittent index error before? and
thoughts on where to look /how to debug this?
Thanks in advance
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-26 20:04 ` Vijaykumar Jain <[email protected]>
2025-04-26 20:12 ` Re: Intermittent missing chunk ERRORS Vijaykumar Jain <[email protected]>
1 sibling, 1 reply; 12+ messages in thread
From: Vijaykumar Jain @ 2025-04-26 20:04 UTC (permalink / raw)
To: Sbob <[email protected]>; +Cc: Pgsql-admin <[email protected]>
a lot in this gist is something you can debug to start with.
https://gist.github.com/supix/80f9a6111dc954cf38ee99b9dedf187a
if not the index than the toast table might be corrupt.
the bigger problem might be to figure out how it got corrupt. a one off
power incident, or bug or storage is having issues etc.
once that is clear,
a simple start would be to run a pg_dump of the database to /dev/null and
check if the dump is clean without errors.
Thanks,
Vijay
Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada;
On Sat, Apr 26, 2025, 10:39 PM Sbob <[email protected]> wrote:
> All;
>
>
> I am working with a client, they have an app that is running into error's
> like this:
>
>
> *SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477
> in pg_toast_5321896 *
>
>
> I thought it was a corruption / REINDEX issue, however I have run
> REINDEXon the index, REINDEX on the table and even a reindexdb on the
> database. Today they passed me another error (the one above) and once I was
> able to login I ran the same query the app team referenced and it ran
> without any errors. I asked them to re-run the query and they informed me
> that it now works for them as well.
>
>
> I assume that if this were a corrupt index issue then the query would
> continue to fail every time.
>
> Has anyone seen this type of intermittent index error before? and thoughts
> on where to look /how to debug this?
>
>
> Thanks in advance
>
>
>
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:04 ` Re: Intermittent missing chunk ERRORS Vijaykumar Jain <[email protected]>
@ 2025-04-26 20:12 ` Vijaykumar Jain <[email protected]>
0 siblings, 0 replies; 12+ messages in thread
From: Vijaykumar Jain @ 2025-04-26 20:12 UTC (permalink / raw)
To: Sbob <[email protected]>; +Cc: Pgsql-admin <[email protected]>
forgot about the core tool to check for the problem.
https://www.postgresql.org/docs/current/amcheck.html
this can help find corruption issues much faster.
Thanks,
Vijay
Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada;
On Sun, Apr 27, 2025, 1:34 AM Vijaykumar Jain <
[email protected]> wrote:
> a lot in this gist is something you can debug to start with.
>
> https://gist.github.com/supix/80f9a6111dc954cf38ee99b9dedf187a
>
> if not the index than the toast table might be corrupt.
> the bigger problem might be to figure out how it got corrupt. a one off
> power incident, or bug or storage is having issues etc.
> once that is clear,
> a simple start would be to run a pg_dump of the database to /dev/null and
> check if the dump is clean without errors.
>
> Thanks,
> Vijay
>
> Open to work
> Resume - Vijaykumar Jain <https://github.com/cabecada;
>
> On Sat, Apr 26, 2025, 10:39 PM Sbob <[email protected]> wrote:
>
>> All;
>>
>>
>> I am working with a client, they have an app that is running into error's
>> like this:
>>
>>
>> *SQL Error [XX001]: ERROR: missing chunk number 0 for toast value
>> 44915477 in pg_toast_5321896 *
>>
>>
>> I thought it was a corruption / REINDEX issue, however I have run
>> REINDEXon the index, REINDEX on the table and even a reindexdb on the
>> database. Today they passed me another error (the one above) and once I was
>> able to login I ran the same query the app team referenced and it ran
>> without any errors. I asked them to re-run the query and they informed me
>> that it now works for them as well.
>>
>>
>> I assume that if this were a corrupt index issue then the query would
>> continue to fail every time.
>>
>> Has anyone seen this type of intermittent index error before? and
>> thoughts on where to look /how to debug this?
>>
>>
>> Thanks in advance
>>
>>
>>
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-26 20:24 ` Laurenz Albe <[email protected]>
2025-04-26 20:28 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
2025-04-26 22:54 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
1 sibling, 2 replies; 12+ messages in thread
From: Laurenz Albe @ 2025-04-26 20:24 UTC (permalink / raw)
To: Sbob <[email protected]>; [email protected]
On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
> I am working with a client, they have an app that is running into error's like this:
>
> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896
>
> I thought it was a corruption / REINDEX issue, however I have run REINDEXon the index,
> REINDEX on the table and even a reindexdb on the database. Today they passed me another
> error (the one above) and once I was able to login I ran the same query the app team
> referenced and it ran without any errors. I asked them to re-run the query and they
> informed me that it now works for them as well.
>
> I assume that if this were a corrupt index issue then the query would continue to fail
> every time.
>
> Has anyone seen this type of intermittent index error before? and thoughts on where to
> look /how to debug this?
The error is data corruption.
If it is transient, that could mean a couple of things:
- a TOAST index was corrupted, and REINDEX fixed it
- faulty RAM corrupted data, but as soon as the corrupted buffer was evicted,
the problem was gone
- somebody updated or deleted the row, and the corruption is hidden
Yours,
Laurenz Albe
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
@ 2025-04-26 20:28 ` Tom Lane <[email protected]>
2025-04-26 22:49 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
1 sibling, 1 reply; 12+ messages in thread
From: Tom Lane @ 2025-04-26 20:28 UTC (permalink / raw)
To: Laurenz Albe <[email protected]>; +Cc: Sbob <[email protected]>; [email protected]
Laurenz Albe <[email protected]> writes:
> On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
>> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896
>> Has anyone seen this type of intermittent index error before? and thoughts on where to
>> look /how to debug this?
> The error is data corruption.
We've seen transient errors of this sort arise from what are basically
timing problems, and fixed at least a few cases. I wonder exactly
which PG version the problematic installation is running.
regards, tom lane
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
2025-04-26 20:28 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
@ 2025-04-26 22:49 ` Sbob <[email protected]>
2025-04-27 00:03 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
0 siblings, 1 reply; 12+ messages in thread
From: Sbob @ 2025-04-26 22:49 UTC (permalink / raw)
To: Tom Lane <[email protected]>; Laurenz Albe <[email protected]>; +Cc: [email protected]
On 4/26/25 2:28 PM, Tom Lane wrote:
> Laurenz Albe <[email protected]> writes:
>> On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
>>> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896
>>> Has anyone seen this type of intermittent index error before? and thoughts on where to
>>> look /how to debug this?
>> The error is data corruption.
> We've seen transient errors of this sort arise from what are basically
> timing problems, and fixed at least a few cases. I wonder exactly
> which PG version the problematic installation is running.
>
> regards, tom lane
We are on V14
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
2025-04-26 20:28 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
2025-04-26 22:49 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-27 00:03 ` Tom Lane <[email protected]>
2025-04-27 16:45 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
0 siblings, 1 reply; 12+ messages in thread
From: Tom Lane @ 2025-04-27 00:03 UTC (permalink / raw)
To: Sbob <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]
Sbob <[email protected]> writes:
> On 4/26/25 2:28 PM, Tom Lane wrote:
>> We've seen transient errors of this sort arise from what are basically
>> timing problems, and fixed at least a few cases. I wonder exactly
>> which PG version the problematic installation is running.
> We are on V14
14-what? We are talking about bug fixes, so it matters.
regards, tom lane
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
2025-04-26 20:28 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
2025-04-26 22:49 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-27 00:03 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
@ 2025-04-27 16:45 ` Sbob <[email protected]>
2025-04-27 18:25 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
0 siblings, 1 reply; 12+ messages in thread
From: Sbob @ 2025-04-27 16:45 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]
On 4/26/25 6:03 PM, Tom Lane wrote:
> Sbob <[email protected]> writes:
>> On 4/26/25 2:28 PM, Tom Lane wrote:
>>> We've seen transient errors of this sort arise from what are basically
>>> timing problems, and fixed at least a few cases. I wonder exactly
>>> which PG version the problematic installation is running.
>> We are on V14
> 14-what? We are talking about bug fixes, so it matters.
>
> regards, tom lane
postgres=# select version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 14.17 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
20210514 (Red Hat 8.5.0-23), 64-bit
(1 row)
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
2025-04-26 20:28 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
2025-04-26 22:49 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-27 00:03 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
2025-04-27 16:45 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-27 18:25 ` Tom Lane <[email protected]>
2025-04-28 14:10 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
0 siblings, 1 reply; 12+ messages in thread
From: Tom Lane @ 2025-04-27 18:25 UTC (permalink / raw)
To: Sbob <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]
Sbob <[email protected]> writes:
> On 4/26/25 6:03 PM, Tom Lane wrote:
>> 14-what? We are talking about bug fixes, so it matters.
> PostgreSQL 14.17 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
> 20210514 (Red Hat 8.5.0-23), 64-bit
OK, well that's current at least...
I don't suppose you can build a test case that reproduces this
failure? If it's timing-dependent as I suspect, it might only fail
once in awhile, but that would be good enough for investigation.
regards, tom lane
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
2025-04-26 20:28 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
2025-04-26 22:49 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-27 00:03 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
2025-04-27 16:45 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-27 18:25 ` Re: Intermittent missing chunk ERRORS Tom Lane <[email protected]>
@ 2025-04-28 14:10 ` Sbob <[email protected]>
0 siblings, 0 replies; 12+ messages in thread
From: Sbob @ 2025-04-28 14:10 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]
On 4/27/25 12:25 PM, Tom Lane wrote:
> Sbob <[email protected]> writes:
>> On 4/26/25 6:03 PM, Tom Lane wrote:
>>> 14-what? We are talking about bug fixes, so it matters.
>> PostgreSQL 14.17 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
>> 20210514 (Red Hat 8.5.0-23), 64-bit
> OK, well that's current at least...
>
> I don't suppose you can build a test case that reproduces this
> failure? If it's timing-dependent as I suspect, it might only fail
> once in awhile, but that would be good enough for investigation.
>
> regards, tom lane
Yeah, it only fails once in awhile, it seems like it fairly often but
building a case to recreate it will likely be difficult. I will continue
to work with the app team and see if I can find a pattern / process that
recreates it.
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
@ 2025-04-26 22:54 ` Sbob <[email protected]>
2025-04-27 19:52 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
1 sibling, 1 reply; 12+ messages in thread
From: Sbob @ 2025-04-26 22:54 UTC (permalink / raw)
To: Laurenz Albe <[email protected]>; [email protected]
On 4/26/25 2:24 PM, Laurenz Albe wrote:
> On Sat, 2025-04-26 at 11:09 -0600, Sbob wrote:
>> I am working with a client, they have an app that is running into error's like this:
>>
>> SQL Error [XX001]: ERROR: missing chunk number 0 for toast value 44915477 in pg_toast_5321896
>>
>> I thought it was a corruption / REINDEX issue, however I have run REINDEXon the index,
>> REINDEX on the table and even a reindexdb on the database. Today they passed me another
>> error (the one above) and once I was able to login I ran the same query the app team
>> referenced and it ran without any errors. I asked them to re-run the query and they
>> informed me that it now works for them as well.
>>
>> I assume that if this were a corrupt index issue then the query would continue to fail
>> every time.
>>
>> Has anyone seen this type of intermittent index error before? and thoughts on where to
>> look /how to debug this?
> The error is data corruption.
>
> If it is transient, that could mean a couple of things:
>
> - a TOAST index was corrupted, and REINDEX fixed it
>
> - faulty RAM corrupted data, but as soon as the corrupted buffer was evicted,
> the problem was gone
>
> - somebody updated or deleted the row, and the corruption is hidden
>
> Yours,
> Laurenz Albe
I did REINDEX and the error came back, however this last time when they
informed me about the error they sent the select statement throwing the
error and once I logged in I ran their select statement and it worked,
no REINDEX had been run, and then I asked them to run it again and it
worked for them as well
So, the first option above is not the answer and I dont think anyone
updated the row, their process is mostly reads but I will verify, so
maybe the RAM issue? the servers are cloud VM's in a private cloud
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Intermittent missing chunk ERRORS
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:24 ` Re: Intermittent missing chunk ERRORS Laurenz Albe <[email protected]>
2025-04-26 22:54 ` Re: Intermittent missing chunk ERRORS Sbob <[email protected]>
@ 2025-04-27 19:52 ` Laurenz Albe <[email protected]>
0 siblings, 0 replies; 12+ messages in thread
From: Laurenz Albe @ 2025-04-27 19:52 UTC (permalink / raw)
To: Sbob <[email protected]>; [email protected]
On Sat, 2025-04-26 at 16:54 -0600, Sbob wrote:
> > If it is transient, that could mean a couple of things:
> >
> > - a TOAST index was corrupted, and REINDEX fixed it
> >
> > - faulty RAM corrupted data, but as soon as the corrupted buffer was evicted,
> > the problem was gone
> >
> > - somebody updated or deleted the row, and the corruption is hidden
>
> I did REINDEX and the error came back, however this last time when they
> informed me about the error they sent the select statement throwing the
> error and once I logged in I ran their select statement and it worked,
> no REINDEX had been run, and then I asked them to run it again and it
> worked for them as well
>
> So, the first option above is not the answer and I dont think anyone
> updated the row, their process is mostly reads but I will verify, so
> maybe the RAM issue? the servers are cloud VM's in a private cloud
Maybe. Check the hardware.
Or maybe it is as Tom says.
Yours,
Laurenz Albe
^ permalink raw reply [nested|flat] 12+ messages in thread
end of thread, other threads:[~2025-04-28 14:10 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-04-26 17:09 Intermittent missing chunk ERRORS Sbob <[email protected]>
2025-04-26 20:04 ` Vijaykumar Jain <[email protected]>
2025-04-26 20:12 ` Vijaykumar Jain <[email protected]>
2025-04-26 20:24 ` Laurenz Albe <[email protected]>
2025-04-26 20:28 ` Tom Lane <[email protected]>
2025-04-26 22:49 ` Sbob <[email protected]>
2025-04-27 00:03 ` Tom Lane <[email protected]>
2025-04-27 16:45 ` Sbob <[email protected]>
2025-04-27 18:25 ` Tom Lane <[email protected]>
2025-04-28 14:10 ` Sbob <[email protected]>
2025-04-26 22:54 ` Sbob <[email protected]>
2025-04-27 19:52 ` Laurenz Albe <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox