postgres server crash with "Segmentation fault"
6+ messages / 3 participants
* postgres server crash with "Segmentation fault"
@ 2025-10-16 11:07 Ishan Arunkumar Joshi <[email protected]>
2025-10-17 00:52 ` Re: postgres server crash with "Segmentation fault" Willian Colognesi <[email protected]>
2025-10-17 06:49 ` Re: postgres server crash with "Segmentation fault" Laurenz Albe <[email protected]>
0 siblings, 2 replies; 6+ messages in thread
From: Ishan Arunkumar Joshi @ 2025-10-16 11:07 UTC (permalink / raw)
To: pgsql-admin
Hi team,
We are running PostgreSQL 16.9 in a Patroni setup in production. Last night the postgres server crashed with "Segmentation fault" while autovacuum was processing a table. Interestingly, when we ran vacuum on the same table on the standby node, the standby node crashed as well. The table could not be queried either: executing a SELECT statement against it also crashed the database.
We observed a few errors for the same table prior to the crash. (Table and function names have been changed for this post.)
"ERROR : Error occurred at function get_details page 117 of relation ""impacted_table"" should be empty but is not"
At the same time, another table was also reporting the error below. However, once the database was restarted, we no longer saw any issue with table oid = 1108029.
"ERROR : Error occurred at function get_details unexpected data beyond EOF in block 16276 of relation base/33195/1108029"
Finally, the server failed as follows:
2025-10-15 02:50:52.428 [432443]LOG: terminating any other active server processes"
2025-10-15 02:50:52.428 [432443]DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE schema.impacted_table"
2025-10-15 02:50:52.428 [432443]LOG: server process (PID 390906) was terminated by signal 11: Segmentation fault"
2025-10-15 02:50:55.475 [432443]LOG: all server processes terminated; reinitializing"
2025-10-15 02:51:32.575 [432443]LOG: received immediate shutdown request"
2025-10-15 02:51:32.629 [432443]LOG: database system is shut down"
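When triaging a crash like this from the server log, it can help to pull out the crashed PID, the signal, and the statement that was running. The following is a minimal Python sketch, assuming the message wording shown in the excerpt above. The function name `summarize_crash` and the regular expressions are illustrative assumptions, not part of any PostgreSQL tooling.

```python
import re

# Only the message text is matched, because the line prefix depends on
# the server's log_line_prefix setting. A trailing double quote, as seen
# in the excerpt, is excluded from the captured values.
TERMINATED = re.compile(
    r'server process \(PID (?P<pid>\d+)\) was terminated by signal '
    r'(?P<signal>\d+): (?P<name>[^"\n]+)'
)
FAILED_QUERY = re.compile(r'Failed process was running: (?P<query>[^"\n]+)')


def summarize_crash(log_text: str) -> dict:
    """Extract the crashed PID, signal, and running statement, if present."""
    summary = {}
    match = TERMINATED.search(log_text)
    if match:
        summary["pid"] = int(match.group("pid"))
        summary["signal"] = int(match.group("signal"))
        summary["signal_name"] = match.group("name").strip()
    match = FAILED_QUERY.search(log_text)
    if match:
        summary["query"] = match.group("query").strip()
    return summary


if __name__ == "__main__":
    sample = (
        '2025-10-15 02:50:52.428 [432443]DETAIL: Failed process was running: '
        'autovacuum: VACUUM ANALYZE schema.impacted_table"\n'
        '2025-10-15 02:50:52.428 [432443]LOG: server process (PID 390906) '
        'was terminated by signal 11: Segmentation fault"\n'
    )
    print(summarize_crash(sample))
```

The summary only restates what the postmaster logged. It does not explain why the backend crashed; a core dump or stack trace would be needed for that.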
The function that was executing runs TRUNCATE TABLE and INSERT/UPDATE statements against this table. This is normal functionality, but we suspect that during execution it corrupted shared memory and in-flight data, which was then replicated to the replica node/DR site and corrupted the same table there.
We had to drop the table "impacted_table" from the database once the database was up.
However, we have not been able to identify the exact root cause of the "Segmentation fault" error for this table. We need expert advice to find the root cause, as well as suggestions for preventive steps.
Thanks & Regards,
-------------------------
Ishan Joshi
________________________________
The information transmitted herein is intended only for the person or entity to which it is addressed and may contain confidential, proprietary and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer.
* Re: postgres server crash with "Segmentation fault"
2025-10-16 11:07 postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
@ 2025-10-17 00:52 ` Willian Colognesi <[email protected]>
2025-10-17 11:03 ` RE: postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
1 sibling, 1 reply; 6+ messages in thread
From: Willian Colognesi @ 2025-10-17 00:52 UTC (permalink / raw)
To: Ishan Arunkumar Joshi <[email protected]>; +Cc: pgsql-admin
Not sure if it's the same problem, but I have seen segmentation faults in the past
when JIT was enabled in postgres. After I disabled it, they never happened again.
Willian Colognesi | Principal Engineer
[email protected]
R. Adhemar Pereira de Barros, 1500, 14° floor, Londrina/PR | Brazil
* RE: postgres server crash with "Segmentation fault"
2025-10-16 11:07 postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
2025-10-17 00:52 ` Re: postgres server crash with "Segmentation fault" Willian Colognesi <[email protected]>
@ 2025-10-17 11:03 ` Ishan Arunkumar Joshi <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Ishan Arunkumar Joshi @ 2025-10-17 11:03 UTC (permalink / raw)
To: Willian Colognesi <[email protected]>; pgsql-admin
Hi Willian,
Thanks for the response. Yes, I found the thread that suggests disabling JIT. However, I am not sure how JIT would be involved here. As I understand it, the last message in that thread reported that the segmentation fault still occurred even after JIT was disabled, so I do not believe JIT is the root cause or that disabling it is the solution.
Thanks & Regards,
-------------------------
Ishan Joshi
* Re: postgres server crash with "Segmentation fault"
2025-10-16 11:07 postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
@ 2025-10-17 06:49 ` Laurenz Albe <[email protected]>
2025-10-17 11:14 ` RE: postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
2025-10-24 06:24 ` RE: postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
1 sibling, 2 replies; 6+ messages in thread
From: Laurenz Albe @ 2025-10-17 06:49 UTC (permalink / raw)
To: Ishan Arunkumar Joshi <[email protected]>; pgsql-admin
On Thu, 2025-10-16 at 11:07 +0000, Ishan Arunkumar Joshi wrote:
> We are using PG16.9 in Patroni Postgres setup in production. Last night we have face
> an issue [various data corruption errors]
>
> We had to drop the table “impacted_table” from database once the database is up.
>
> however we are not able to identify exact root cause behind “segmentation fault”
> error for this table and need expert advice to find the root case and also need
> suggestions to prevention steps.
By dropping the table, you have probably destroyed the evidence needed for that.
If you have a file system backup of the corrupted state, an expert might be able
to identify probable causes.
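If such a file system copy is available, one starting point is to work out which file and byte offset hold the block named in an error message such as `block 16276 of relation base/33195/1108029` from the original report. The following is a minimal Python sketch of that arithmetic, assuming the default build settings of an 8 kB block size and 1 GB segment files; the function name `locate_block` is hypothetical. Note that the last path component is the relation's filenode, which is not necessarily the same number as the table's OID.

```python
BLOCK_SIZE = 8192            # assumed default block size (8 kB)
SEGMENT_SIZE = 1024 ** 3     # assumed default segment size (1 GB)
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 131072 blocks per file


def locate_block(relation_path: str, block: int) -> tuple[str, int]:
    """Return (segment file path, byte offset within that file) for a block."""
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    # Segment 0 has no suffix; later segments are named <path>.1, <path>.2, ...
    file_path = relation_path if segment == 0 else f"{relation_path}.{segment}"
    return file_path, block_in_segment * BLOCK_SIZE


if __name__ == "__main__":
    path, offset = locate_block("base/33195/1108029", 16276)
    print(path, offset)
```

With these assumed defaults, block 16276 falls in the first segment file at byte offset 133332992. A server built with a non-default block or segment size would need the constants adjusted.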
One prevention step would have been to run the latest minor release (currently 16.10).
Other than that, make sure that you have a good backup that is occasionally tested
and make sure that the backup is monitored (I have seen cases where the backup was a
daily pg_dump, and only when the corruption surfaced, people realized that the pg_dump
had been failing for the exact same reason...).
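A minor-release check is easy to get wrong when version strings are compared as text, since "16.10" sorts before "16.9" that way. The following is a minimal Python sketch of a numeric comparison, assuming plain `major.minor` version strings; the function names `parse_version` and `is_behind` are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a version string such as '16.10' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_behind(running: str, latest: str) -> bool:
    """True when `running` is an older release than `latest`."""
    return parse_version(running) < parse_version(latest)


if __name__ == "__main__":
    # Numeric comparison: 16.9 is older than 16.10.
    print(is_behind("16.9", "16.10"))
    # Text comparison gets this wrong because "9" sorts after "1".
    print("16.9" < "16.10")
```

Comparing integer tuples keeps the check correct once the minor number reaches two digits.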
Yours,
Laurenz Albe
* RE: postgres server crash with "Segmentation fault"
2025-10-16 11:07 postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
2025-10-17 06:49 ` Re: postgres server crash with "Segmentation fault" Laurenz Albe <[email protected]>
@ 2025-10-17 11:14 ` Ishan Arunkumar Joshi <[email protected]>
1 sibling, 0 replies; 6+ messages in thread
From: Ishan Arunkumar Joshi @ 2025-10-17 11:14 UTC (permalink / raw)
To: Laurenz Albe <[email protected]>; pgsql-admin
Hi Laurenz,
> By dropping the table, you have probably destroyed the evidence needed for that.
> If you have a file system backup of the corrupted state, an expert might be able to identify probable causes.
> One prevention step would have been to run the latest minor release (currently 16.10).
> Other than that, make sure that you have a good backup that is occasionally tested and make sure that the backup is monitored (I have seen cases where the backup was a daily pg_dump, and only when the corruption surfaced, people realized that the pg_dump had been failing for the exact same reason...).
Thanks for your reply.
We use pgBackRest for backups, and an incremental backup had been taken beforehand. Our database is very large (>25 TB), and we currently do not have another server with enough capacity to restore it. However, I suspect that the table was corrupted by some operation at runtime, which then triggered autovacuum. Because the corruption had already happened right before the restart, autovacuum hit the segmentation fault as soon as it obtained its lock and ran, and that restarted the postgres service.
I tried to run a SELECT query on the same table, but it failed with the same error, so I did not risk running pg_dump, as it might have restarted the database server and impacted production. Because it is a temp table populated for some process, we decided to drop and recreate it.
We still have no clue how the table got corrupted or why the segmentation fault occurred.
I went through an old thread that suggests disabling JIT, but that solution did not work there, so I am not considering JIT as the root cause or disabling it as the solution.
Thanks & Regards,
-------------------------
Ishan Joshi
* RE: postgres server crash with "Segmentation fault"
2025-10-16 11:07 postgres server crash with "Segmentation fault" Ishan Arunkumar Joshi <[email protected]>
2025-10-17 06:49 ` Re: postgres server crash with "Segmentation fault" Laurenz Albe <[email protected]>
@ 2025-10-24 06:24 ` Ishan Arunkumar Joshi <[email protected]>
1 sibling, 0 replies; 6+ messages in thread
From: Ishan Arunkumar Joshi @ 2025-10-24 06:24 UTC (permalink / raw)
To: Laurenz Albe <[email protected]>; pgsql-admin
Hi Team,
Do you have any further suggestions or a way forward for further investigation? Please advise.
Thanks & Regards,
-------------------------
Ishan Joshi