public inbox for [email protected]help / color / mirror / Atom feed
RE: Index (primary key) corrupt? 13+ messages / 3 participants [nested] [flat]
* RE: Index (primary key) corrupt? @ 2026-03-06 10:31 Wim Rouquart <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Wim Rouquart @ 2026-03-06 10:31 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; Greg Sabino Mullane <[email protected]>; +Cc: [email protected] <[email protected]> Internal > Alright, so the corrupt index is transferred by the binary pg_basebackup, but not in logical backups done via pg_dump/pg_restore. Correct > The issue then is on the source database with whatever process is corrupting the index and causing no error to be thrown when the table is dumped. "Whatever process is corrupting the index" -> I think this was more of a one-off incident that happened somewhere in the past, this is not a recurring issue. It's not like this index becomes corrupt again after I reindex it, to be clear. >Just to be clear we are talking about this table: >CREATE TABLE bcf_work_type ( > id bigserial NOT NULL, > aml_score int8 NOT NULL, > CONSTRAINT idx_376814_primary PRIMARY KEY (id) ); Yes > What is the use pattern for this table? Well, I understood from the devs it is basically no longer used currently, probably explaining why no issues have shown up until the datarefresh. > As I recall this is not a large table, but for completeness what is it's average size? It has only 12 rows 😊 > What are the Postgres log settings, on the source database, for?: log_error_verbosity = 'default' log_min_error_statement = 'fatal' log_min_messages = 'warning' log_statement = 'ddl' > Are there any entries in the Postgres log that reference this table? Nope, none found, probably because it isn't used anymore (and because of above log settings if it would be) -----Original Message----- From: Adrian Klaver <[email protected]> Sent: donderdag 5 maart 2026 17:05 To: Wim Rouquart <[email protected]>; Greg Sabino Mullane <[email protected]> Cc: [email protected] Subject: Re: Index (primary key) corrupt? The real sender of this external email is [email protected] On 3/5/26 5:23 AM, Wim Rouquart wrote: > Internal > >> So the REINDEX on the source PK is prompted by it not showing up on the target? > > That's how we noticed the initial issue yes, we got errors during the datarefresh on the target database where foreign keys wanted to reference the non-existing index (because it wasn't imported). > >> On the test instance you set up with pg_basebackup and that has the corrupted index what happens if you try to INSERT a record with a duplicate id? > > Good question, as I kind of expected, it doesn't complain at all when I do an insert with a duplicate id (and the row actually gets inserted). If I consecutively try to do the reindex, then I get the error that it can't because of doubles... > > > -----Original Message----- > From: Adrian Klaver <[email protected]> > Sent: vrijdag 13 februari 2026 18:33 > To: Wim Rouquart <[email protected]>; Greg Sabino Mullane > <[email protected]> > Cc: [email protected] > Subject: Re: Index (primary key) corrupt? > > > > The real sender of this external email is [email protected] > > > > > > > On 2/13/26 8:27 AM, Wim Rouquart wrote: >> Internal >> >> 1) ) It won't be included with the CREATE TABLE statement per:- >> >> Yes, let's keep it at: it's not in the dumpfile anywhere. >> >>> 2) The issue seems to be not the dump, but the non-functional state of the index on the source database. >> >>> Is there any indication of why that is happening? >> >> Not as far as I know. > > So the REINDEX on the source PK is prompted by it not showing up on the target? > >> >>> Also what error do you get on the source database that tells you the PK is not working? >> >> None, only noticed the issue because of the datarefresh to another instance where it turned out the primary key was not created in the target (because it was not in the dumpfile). > > On the test instance you set up with pg_basebackup and that has the corrupted index what happens if you try to INSERT a record with a duplicate id? Alright, so the corrupt index is transferred by the binary pg_basebackup, but not in logical backups done via pg_dump/pg_restore. The issue then is on the source database with whatever process is corrupting the index and causing no error to be thrown when the table is dumped. Just to be clear we are talking about this table: CREATE TABLE bcf_work_type ( id bigserial NOT NULL, aml_score int8 NOT NULL, CONSTRAINT idx_376814_primary PRIMARY KEY (id) ); What is the use pattern for this table? As I recall this is not a large table, but for completeness what is it's average size? What are the Postgres log settings, on the source database, for?: log_min_messages log_min_error_statement log_error_verbosity log_statement Are there any entries in the Postgres log that reference this table? > >> > >>> I hope this clears out any confusion. >>> >>> -----Original Message----- >> Adrian Klaver >> [email protected] >> >> Disclaimer <https://www.kbc.com/KBCmailDisclaimer; > > > Disclaimer <https://www.kbc.com/KBCmailDisclaimer; -- Adrian Klaver [email protected] Disclaimer <https://www.kbc.com/KBCmailDisclaimer; ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-06 16:03 Adrian Klaver <[email protected]> parent: Wim Rouquart <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Adrian Klaver @ 2026-03-06 16:03 UTC (permalink / raw) To: Wim Rouquart <[email protected]>; Greg Sabino Mullane <[email protected]>; +Cc: [email protected] <[email protected]> On 3/6/26 2:31 AM, Wim Rouquart wrote: > Internal > > >> As I recall this is not a large table, but for completeness what is it's average size? > > It has only 12 rows 😊 > >> What are the Postgres log settings, on the source database, for?: > > log_error_verbosity = 'default' > log_min_error_statement = 'fatal' > log_min_messages = 'warning' > log_statement = 'ddl' > >> Are there any entries in the Postgres log that reference this table? > > Nope, none found, probably because it isn't used anymore (and because of above log settings if it would be) > Well at this point I don't see that it is possible to determine a cause for why the index does not transfer or throw an error in the process. -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 13+ messages in thread
* RE: Index (primary key) corrupt? @ 2026-03-09 14:12 Wim Rouquart <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 2 replies; 13+ messages in thread From: Wim Rouquart @ 2026-03-09 14:12 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; Greg Sabino Mullane <[email protected]>; +Cc: [email protected] <[email protected]> Internal I already saw finding the actual cause as a 'lost cause' as these things tend to happen, however what bothers me most is that a tool like amcheck which is supposed to find corruption also shows up with no result. -----Original Message----- From: Adrian Klaver <[email protected]> Sent: vrijdag 6 maart 2026 17:03 To: Wim Rouquart <[email protected]>; Greg Sabino Mullane <[email protected]> Cc: [email protected] Subject: Re: Index (primary key) corrupt? The real sender of this external email is [email protected] On 3/6/26 2:31 AM, Wim Rouquart wrote: > Internal > > >> As I recall this is not a large table, but for completeness what is it's average size? > > It has only 12 rows 😊 > >> What are the Postgres log settings, on the source database, for?: > > log_error_verbosity = 'default' > log_min_error_statement = 'fatal' > log_min_messages = 'warning' > log_statement = 'ddl' > >> Are there any entries in the Postgres log that reference this table? > > Nope, none found, probably because it isn't used anymore (and because > of above log settings if it would be) > Well at this point I don't see that it is possible to determine a cause for why the index does not transfer or throw an error in the process. -- Adrian Klaver [email protected] Disclaimer <https://www.kbc.com/KBCmailDisclaimer; ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-09 14:52 Adrian Klaver <[email protected]> parent: Wim Rouquart <[email protected]> 1 sibling, 0 replies; 13+ messages in thread From: Adrian Klaver @ 2026-03-09 14:52 UTC (permalink / raw) To: Wim Rouquart <[email protected]>; Greg Sabino Mullane <[email protected]>; +Cc: [email protected] <[email protected]> On 3/9/26 7:12 AM, Wim Rouquart wrote: > Internal > > I already saw finding the actual cause as a 'lost cause' as these things tend to happen, however what bothers me most is that a tool like amcheck which is supposed to find corruption also shows up with no result. That assumes the corruption is something that amcheck checks for. -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-09 15:24 Greg Sabino Mullane <[email protected]> parent: Wim Rouquart <[email protected]> 1 sibling, 1 reply; 13+ messages in thread From: Greg Sabino Mullane @ 2026-03-09 15:24 UTC (permalink / raw) To: Wim Rouquart <[email protected]>; +Cc: Adrian Klaver <[email protected]>; [email protected] <[email protected]> On Mon, Mar 9, 2026 at 10:12 AM Wim Rouquart <[email protected]> wrote: > I already saw finding the actual cause as a 'lost cause' as these things > tend to happen, however what bothers me most is that a tool like amcheck > which is supposed to find corruption also shows up with no result. > Well, no, these things really should not happen. :) It may be too late, but it would be real interesting to see this query both before and after the REINDEX: select * from pg_index where indrelid = 'bcf_work_type'::regclass and indisprimary; An incorrect indrelid is one way I can think of as to how pg_dump would miss it, but that wouldn't explain why reindex would subsequently fix it. Cheers, Greg ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-09 15:37 Adrian Klaver <[email protected]> parent: Greg Sabino Mullane <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Adrian Klaver @ 2026-03-09 15:37 UTC (permalink / raw) To: Greg Sabino Mullane <[email protected]>; Wim Rouquart <[email protected]>; +Cc: [email protected] <[email protected]> On 3/9/26 8:24 AM, Greg Sabino Mullane wrote: > On Mon, Mar 9, 2026 at 10:12 AM Wim Rouquart <[email protected] > <mailto:[email protected]>> wrote: > > I already saw finding the actual cause as a 'lost cause' as these > things tend to happen, however what bothers me most is that a tool > like amcheck which is supposed to find corruption also shows up with > no result. > > > Well, no, these things really should not happen. :) > > It may be too late, but it would be real interesting to see this query > both before and after the REINDEX: > > select * from pg_index where indrelid  = 'bcf_work_type'::regclass and > indisprimary; Déjà vu :) https://www.postgresql.org/message-id/CAKAnmmK9uKAcerhseNg6FSDOnMWmivM5ctUiTAdc1kobq94Dqw%40mail.gma... This post in answer: https://www.postgresql.org/message-id/AS2PR05MB107549DDE42DC0B8E31CB52BFEF90A%40AS2PR05MB10754.eurpr... would seem to indicate that is not the issue. > > An incorrect indrelid is one way I can think of as to how pg_dump would > miss it, but that wouldn't explain why reindex would subsequently fix it. > > Cheers, > Greg > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-09 17:53 Greg Sabino Mullane <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Greg Sabino Mullane @ 2026-03-09 17:53 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: Wim Rouquart <[email protected]>; [email protected] <[email protected]> On Mon, Mar 9, 2026 at 11:37 AM Adrian Klaver <[email protected]> wrote: > This post in answer: > > > https://www.postgresql.org/message-id/AS2PR05MB107549DDE42DC0B8E31CB52BFEF90A%40AS2PR05MB10754.eurpr... > > would seem to indicate that is not the issue. > Ah, but those are doing the lookup by the index (via indexrelid) but I am curious about looking up by the table (indrelid), as that is how pg_dump is going to get at it. Cheers, Greg ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-09 19:52 Adrian Klaver <[email protected]> parent: Greg Sabino Mullane <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Adrian Klaver @ 2026-03-09 19:52 UTC (permalink / raw) To: Greg Sabino Mullane <[email protected]>; +Cc: Wim Rouquart <[email protected]>; [email protected] <[email protected]> On 3/9/26 10:53 AM, Greg Sabino Mullane wrote: > On Mon, Mar 9, 2026 at 11:37 AM Adrian Klaver <[email protected] > <mailto:[email protected]>> wrote: > > This post in answer: > > https://www.postgresql.org/message-id/ > AS2PR05MB107549DDE42DC0B8E31CB52BFEF90A%40AS2PR05MB10754.eurprd05.prod.outlook.com <https://www.postgresql.org/message-id/AS2PR05MB107549DDE42DC0B8E31CB52BFEF90A%40AS2PR05MB10754.eurpr...; > > would seem to indicate that is not the issue. > > > Ah, but those are doing the lookup by the index (via indexrelid) but I > am curious about looking up by the table (indrelid), as that is how > pg_dump is going to get at it. Yeah, but the indrelid did not change after the the REINDEX. I guess one could question the cast in: indrelid = 'bcf_work_type'::regclass Though I am not sure how a REINDEX would affect that? > Cheers, > Greg > > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-10 14:15 Greg Sabino Mullane <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Greg Sabino Mullane @ 2026-03-10 14:15 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: Wim Rouquart <[email protected]>; [email protected] <[email protected]> On Mon, Mar 9, 2026 at 3:53 PM Adrian Klaver <[email protected]> wrote: > Yeah, but the indrelid did not change after the the REINDEX. > Agreed, but none of this makes sense. pg_dump grabs pks via a bulk pg_index scan based on table oids, so I wanted to rule out some problem there. I'm chalking this one up to user error, not database corruption, as the OP has not actually shown us the output of how they are determining the missing index, and then how the reindex fixes it (although I appreciate the pg_index query results). To put another way, user error is a much more likely explanation than anything else given the lack of specific data. Cheers, Greg ^ permalink raw reply [nested|flat] 13+ messages in thread
* RE: Index (primary key) corrupt? @ 2026-03-10 14:24 Wim Rouquart <[email protected]> parent: Greg Sabino Mullane <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Wim Rouquart @ 2026-03-10 14:24 UTC (permalink / raw) To: Greg Sabino Mullane <[email protected]>; Adrian Klaver <[email protected]>; +Cc: [email protected] <[email protected]> Internal Let me get this straight, are you still contesting that the index is actually not part of the dumpfile and I somehow just keep on ‘missing it’? From: Greg Sabino Mullane <[email protected]> Sent: dinsdag 10 maart 2026 15:15 To: Adrian Klaver <[email protected]> Cc: Wim Rouquart <[email protected]>; [email protected] Subject: Re: Index (primary key) corrupt? The real sender of this external email is [email protected]<mailto:[email protected]> On Mon, Mar 9, 2026 at 3:53 PM Adrian Klaver <[email protected]<mailto:[email protected]>> wrote: Yeah, but the indrelid did not change after the the REINDEX. Agreed, but none of this makes sense. pg_dump grabs pks via a bulk pg_index scan based on table oids, so I wanted to rule out some problem there. I'm chalking this one up to user error, not database corruption, as the OP has not actually shown us the output of how they are determining the missing index, and then how the reindex fixes it (although I appreciate the pg_index query results). To put another way, user error is a much more likely explanation than anything else given the lack of specific data. Cheers, Greg Disclaimer <https://www.kbc.com/KBCmailDisclaimer; ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-10 14:43 Greg Sabino Mullane <[email protected]> parent: Wim Rouquart <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Greg Sabino Mullane @ 2026-03-10 14:43 UTC (permalink / raw) To: Wim Rouquart <[email protected]>; +Cc: Adrian Klaver <[email protected]>; [email protected] <[email protected]> On Tue, Mar 10, 2026 at 10:24 AM Wim Rouquart <[email protected]> wrote: > Let me get this straight, are you still contesting that the index is > actually not part of the dumpfile and I somehow just keep on ‘missing it’? > That is one possibility, yes, but there are others. We just don't have enough data. It would be great to see exactly what pg_dump is doing so we know where the corruption/disconnect is. If you have access, could you try: psql -c "alter system set log_statement='all' " -c "select pg_reload_conf()" pg_dump -t bcf_work_type --schema-only > bcf.debug psql -c "alter system reset log_statement" -c "select pg_reload_conf()" Then send us bcf.debug as well as the Postgres logs generated during that request? Cheers, Greg ^ permalink raw reply [nested|flat] 13+ messages in thread
* Re: Index (primary key) corrupt? @ 2026-03-10 15:09 Adrian Klaver <[email protected]> parent: Greg Sabino Mullane <[email protected]> 0 siblings, 1 reply; 13+ messages in thread From: Adrian Klaver @ 2026-03-10 15:09 UTC (permalink / raw) To: Greg Sabino Mullane <[email protected]>; Wim Rouquart <[email protected]>; +Cc: [email protected] <[email protected]> On 3/10/26 7:43 AM, Greg Sabino Mullane wrote: > On Tue, Mar 10, 2026 at 10:24 AM Wim Rouquart <[email protected] > <mailto:[email protected]>> wrote: > > Let me get this straight, are you still contesting that the index is > actually not part of the dumpfile and I somehow just keep on > ‘missing it’? > > That is one possibility, yes, but there are others. We just don't have > enough data. It would be great to see exactly what pg_dump is doing so > we know where the corruption/disconnect is. If you have access, could > you try: I am convinced that the index definition is not in the pg_dump output. The crux of the matter seems to be from here: https://www.postgresql.org/message-id/78328b08-249e-4251-8a10-b5dac183442a%40aklaver.com "Alright, so the corrupt index is transferred by the binary pg_basebackup, but not in logical backups done via pg_dump/pg_restore. The issue then is on the source database with whatever process is corrupting the index and causing no error to be thrown when the table is dumped." Where the pg_basebackup was done from the production database in order to set up a test database and the logical dumps where done from the test database. Hopefully the below will tease that out. > > psql -c "alter system set log_statement='all' " -c "select pg_reload_conf()" > > pg_dump -t bcf_work_type --schema-only > bcf.debug > > psql -c "alter system reset log_statement" -c "select pg_reload_conf()" > > Then send us bcf.debug as well as the Postgres logs generated during > that request? > > Cheers, > Greg > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 13+ messages in thread
* RE: Index (primary key) corrupt? @ 2026-04-09 08:31 Wim Rouquart <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 0 replies; 13+ messages in thread From: Wim Rouquart @ 2026-04-09 08:31 UTC (permalink / raw) To: Greg Sabino Mullane <[email protected]>; [email protected] <[email protected]>; +Cc: Adrian Klaver <[email protected]> Internal OK, Took the mailing list out of the loop because some logging needed to be provided. Reincluding it now because the actual issue has been identified. Putting the list back in so if anyone bumps into this one in the future, it might prove helpful. See Greg’s answer below (the relhasindex column of the pg_class dictionary table was indeed set to ‘false’ for that index, what caused this situation we will probably never find out). Thanks to Greg and Adrian for staying on top of this one. Cheers, Wim. From: Greg Sabino Mullane <[email protected]> Sent: woensdag 18 maart 2026 17:42 To: Wim Rouquart <[email protected]> Cc: Adrian Klaver <[email protected]> Subject: Re: Index (primary key) corrupt? The real sender of this external email is [email protected]<mailto:[email protected]> Ok, I can see the exact problem now in the logs. There is a section of pg_dump that tries to find the indexes, and it builds a list of oids to scan while doing so. In the logs you sent the unnest is an empty value '{}'. It should be '{1998823}'. Indeed, it shows up in other places properly, so there is something unique to that section of code. Looking deeper, that can fail to populated either because the table is "not interesting" (but it surely is by the other queries) or because it has no index. That in turn traces back to the relhasindex column of pg_class. My guess is that something has set that to 'f' when it should be 't'. Not the kind of corruption that can be detected by checksums or by anything like amcheck, as I'm pretty sure the index is there, but pg_class has been corrupted in a way to not allow pg_dump to find it. (New email arrives). Yep - the post reindex one now has the oid in the unnest, which suggests relhasindex was flipped back. Can you verify this theory with a: SELECT * FROM pg_class WHERE oid = 1998823; See if relhasindex is true of false. Is there any chance some tool/person has modified that field, perhaps as some sort of hack to temporarily disable indexes? You can also run this global sanity check. It should return no rows: select relname from pg_class c where relhasindex is false and exists (select 1 from pg_index where c.oid = indrelid); That problem table can be fixed without a REINDEX by simply doing: update pg_class set relhasinex=true where oid = 1998823; Disclaimer <https://www.kbc.com/KBCmailDisclaimer; ^ permalink raw reply [nested|flat] 13+ messages in thread
end of thread, other threads:[~2026-04-09 08:31 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-03-06 10:31 RE: Index (primary key) corrupt? Wim Rouquart <[email protected]> 2026-03-06 16:03 ` Adrian Klaver <[email protected]> 2026-03-09 14:12 ` Wim Rouquart <[email protected]> 2026-03-09 14:52 ` Adrian Klaver <[email protected]> 2026-03-09 15:24 ` Greg Sabino Mullane <[email protected]> 2026-03-09 15:37 ` Adrian Klaver <[email protected]> 2026-03-09 17:53 ` Greg Sabino Mullane <[email protected]> 2026-03-09 19:52 ` Adrian Klaver <[email protected]> 2026-03-10 14:15 ` Greg Sabino Mullane <[email protected]> 2026-03-10 14:24 ` Wim Rouquart <[email protected]> 2026-03-10 14:43 ` Greg Sabino Mullane <[email protected]> 2026-03-10 15:09 ` Adrian Klaver <[email protected]> 2026-04-09 08:31 ` Wim Rouquart <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox