Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uXgnH-00DD4X-QO for pgsql-general@arkaria.postgresql.org; Fri, 04 Jul 2025 13:49:36 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uXgnF-000uzc-2K for pgsql-general@arkaria.postgresql.org; Fri, 04 Jul 2025 13:49:33 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uXgnE-000uzU-Jk for pgsql-general@lists.postgresql.org; Fri, 04 Jul 2025 13:49:33 +0000 Received: from mail-ej1-x62f.google.com ([2a00:1450:4864:20::62f]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uXgnD-005hnI-0e for pgsql-general@lists.postgresql.org; Fri, 04 Jul 2025 13:49:32 +0000 Received: by mail-ej1-x62f.google.com with SMTP id a640c23a62f3a-ae0bc7aa21bso201408166b.2 for ; Fri, 04 Jul 2025 06:49:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=element.io; s=google; t=1751636968; x=1752241768; darn=lists.postgresql.org; h=content-language:in-reply-to:from:references:to:subject:user-agent :mime-version:date:message-id:from:to:cc:subject:date:message-id :reply-to; bh=LzLPvaPiFR5pHKbbf/b1BZBdYuHyYp2/HVHUeGTDLHM=; b=f0Q8FJZP0H0Iop19N5UanP0U1TW52cqFQJvjwZFBZ2EXbQMWJXdBLlZ5JM6bXYeYIJ mc+Ihvw1Ybw8VzTlapbsXXjZ6gMUECdHaIeSrNzf9tTM61nSkxQIkeWday3bSdGC36Wt x0LwsavsLi9Ev6yCi6xLBtLm4MUPMaH/pLxG94eAn7ThSkm9YP8kzi3+R545uS6QgX6S FOqwGYol6UJR2W++bs8kGDAHsw2NnVu7IXrZwEclCbosueqjLdJUGGUtqLF7ZSZkh2pL qbj2VDEKYytfz50m77mysuBY3jMm6HtZ1qOsAV7Gj06R9cZDXdHqjLc21PAQV0j7oqPx B/Lg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1751636968; x=1752241768; h=content-language:in-reply-to:from:references:to:subject:user-agent :mime-version:date:message-id:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=LzLPvaPiFR5pHKbbf/b1BZBdYuHyYp2/HVHUeGTDLHM=; b=N3ZKtJM3QJG1kHP+5kkEzjC0FgrqBXjvbCfB9RCLK4Nis43aaA3NILP/fA8nOuAF8y m6oMJ2MGPUBLQ54/8jsIOGgNGL9c+IZqct4FWXGaILd6f7n9J6Od4L2y7nDvUxtHtoaG g//yAZPDXzh7WyZ1h94Kk/LjBrGVoT9zbS+V08kfScAoxX4+IISv7lxrv40qmsKpGahS GeXw3D3bY8dB5W+b9ofRthyx4ymb974rw1g3CDyhbMFAgoN+GZw5oSdhJjH9SCOYjYJU gJ6MKkg7HKnzjE6gVkxwMvOLT7P4pj+FSOKSHiHcSyZuPrBpVL3pvIBD2ntREU7nry2F 5H7Q== X-Gm-Message-State: AOJu0Yx0QZ4AUnVlaz6cU6pbaHHcsXLTMDr5hRl0H9IlwJeNXcsx3YJ+ doj6BI6L4R0hdIOEdTIaTWSSzkvDEgCfx9iNHQPBq1BYM43NEBMogShttKyPLc4a2CBxDMki5Ud Ihhmj9c4wY+cwlCKIs3oXiJF9ffcXpB+K9CedlvMf4h++NEDL08Qk0XqdWlI0hhyfdxvIGHRIfJ uEBEyvFQ== X-Gm-Gg: ASbGncuN4keUpiikYpG6vtycIYyBq3U2zvkEZzyFocymWkHH3F2AZEoNkBPGFTvj3Rs EZgmtsKfTvJ9aVRz+XciE+/NxfzJ0ftirn8g7zAirIVg/MAvy3pYNxavEbJyQ3BW7N7y2BvNNe/ +Xehe2sODE8enw5456R59V732tu6/N5HUkIEXI5/mEFQYmZbSmeV/Q8OQPhMMeRKYwwHMD7wK0Q 8xBbOsrVex0mcIXAX2zPyLNpqhecrAfaZ1wWmyZ7br8GgPb/Oj9tuNj39Jjb4xKNPUyO0mMmJxl sg6VFmHEXzerICC/Q3gCviBWt20fnUIUrs+z4Al/aSAOPnA9wKR34WfwF4Q19HNU X-Google-Smtp-Source: AGHT+IEYGrZ1JUpiYPKilwehYqn+1dTKM4zjizlHnaR4F7B8izsbQJ78zj7BJr8FByyNQFJp6ZO+cw== X-Received: by 2002:a17:907:6ea5:b0:ae3:5e2d:a8bf with SMTP id a640c23a62f3a-ae3fe5bde90mr216792366b.14.1751636968162; Fri, 04 Jul 2025 06:49:28 -0700 (PDT) Received: from [192.168.68.73] ([217.155.131.108]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-ae3f66d942bsm177014666b.24.2025.07.04.06.49.27 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 04 Jul 2025 06:49:27 -0700 (PDT) Message-ID: <01df0ca4-cf32-448f-a2cb-e44a8e9f7fd6@element.io> Date: Fri, 4 Jul 2025 14:49:27 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Corrupt btree index includes rows that don't match To: pgsql-general@lists.postgresql.org References: From: Erik Johnston In-Reply-To: Content-Type: multipart/alternative; boundary="------------866EKVpxmTft7dDNXQA0xt9t" Content-Language: en-GB List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------866EKVpxmTft7dDNXQA0xt9t Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: quoted-printable Hi, a quick update: - We have discovered that the corruption was present from before libicu=20 update. - We ran `pg_amcheck --index state_groups_state_type_idx=20 --heapallindexed matrix`, which returned nothing - We believe that means that (and matches what we see sampling) the=20 index has gained extra entries, i.e. that for a given state group it=20 does return all the relevant rows in the table /plus/ extra rows. We are also seeing old state groups starting to point at rows that have=20 only just been inserted. For example, querying for 353864583 on the=20 primary it returns that row plus four rows that have been inserted=20 today, but on the backup from last week an index only scan for 353864583=20 only returns one row. This makes it feel like the corruption is ongoing?=20 Nothing should have modified that state group in the interim (they are=20 generally immutable). This naively feels like when inserting a new row we sometimes add the=20 row to the index twice: once pointing from the correct state group to=20 the new row, and once from an old state group to the new row? Thanks, Erik On 03/07/2025 18:07, Erik Johnston wrote: > > Hello, > > > We're looking into a problem with our application and have tracked it=20 > down to index corruption, whereby we have many index rows pointing to=20 > the wrong tuples in the heap. > > > Our table looks like: > > > ``` > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Table "matrix.state_groups_state= " > =C2=A0 =C2=A0Column =C2=A0 =C2=A0| =C2=A0Type =C2=A0| Collation | Nullabl= e | Default > -------------+--------+-----------+----------+--------- > =C2=A0state_group | bigint | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0| > =C2=A0room_id =C2=A0 =C2=A0 | text =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| > =C2=A0type =C2=A0 =C2=A0 =C2=A0 =C2=A0| text =C2=A0 | =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| > =C2=A0state_key =C2=A0 | text =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| > =C2=A0event_id =C2=A0 =C2=A0| text =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| > Indexes: > =C2=A0 =C2=A0 "state_groups_state_room_id_idx" brin (room_id) WITH=20 > (pages_per_range=3D'1') > =C2=A0 =C2=A0 "state_groups_state_type_idx" btree (state_group, type,=20 > state_key), tablespace "postgres_second" > Triggers: > =C2=A0 =C2=A0 check_state_groups_state_deletion_trigger AFTER DELETE ON= =20 > state_groups_state DEFERRABLE INITIALLY DEFERRED FOR EACH ROW EXECUTE=20 > FUNCTION check_state_groups_state_deletion() > > ``` > > > The symptoms we are noticing are that a DELETE or SELECT query=20 > includes rows that don't match the condition, as long as we issue a=20 > query that results in an Index Scan (not Index Only Scan): > > > For example, including `ctid` in the query is enough to make the=20 > planner use an Index Scan: > > > ``` > > SELECT ctid, state_group FROM state_groups_state WHERE state_group =3D=20 > 483128098; > > =C2=A0 =C2=A0 =C2=A0 ctid =C2=A0 =C2=A0 =C2=A0| state_group > ----------------+------------- > =C2=A0(16669607,1) =C2=A0 | =C2=A0 483128098 > =C2=A0(424940858,20) | =C2=A0 963361875 > =C2=A0(16669606,53) =C2=A0| =C2=A0 483128098 > (3 rows) > > ``` > > > But with an Index Only Scan: > > > ``` > > SELECT state_group FROM state_groups_state WHERE state_group =3D 48312809= 8; > =C2=A0state_group > ------------- > =C2=A0 =C2=A0483128098 > =C2=A0 =C2=A0483128098 > =C2=A0 =C2=A0483128098 > (3 rows) > > ``` > > > Since including `ctid` in the SELECT columns causes the query to use=20 > an Index Scan (fetching tuples from the heap), this inconsistency=20 > leads us to believe that our index and heap disagree. > > > Forcing a sequential scan with that same query only returns two rows=20 > matching that state group, which suggests that the index thinks there=20 > are more rows in the table than there actually are. (We do not believe=20 > anything can have deleted a row with state group 483128098). Also=20 > interestingly, querying (with the index re-enabled) for 963361875=20 > returns the same row as returned above, so the row is in the index twice. > > > Another example state group (147961623) should only have a single row=20 > associated with it, and yet the index returns nearly 7000 rows=20 > (including the one we expect). The unexpected state groups are all in=20 > the range 794390760=E2=80=93794393085 (except one in 794411694), and also= have=20 > ctids in range (93454823,48) =E2=80=93 (93455621,49). The fact that these= are=20 > reasonably tight ranges feels suspicious. Note that the state group is=20 > a simple incrementing ID here. > > > This table is quite large (about 6 TB) but we have sampled a few small=20 > ranges of it and found many instances of this type of corruption, in=20 > the first (approximate) half of the key range (0..561M out of 0..1034M). > > > For historical reasons, the table and the index are on different=20 > tablespaces, but the same filesystem. > > > We have sampled the table on our secondary server, and we see the same=20 > sort of corruption going on (though given the size of the data we=20 > don=E2=80=99t actually know if it's exactly the same). > > > One coincidence is that we started seeing the first symptoms of this=20 > around the same time as libicu was updated with a security patch.=20 > However, postgres hasn=E2=80=99t been restarted and doesn=E2=80=99t refer= ence the new=20 > version in its process maps. Plus state groups are integers anyway. We=20 > also use the C locale, not ICU. > > > We=E2=80=99re currently running =E2=80=9Cpg_amcheck --index=20 > state_groups_state_type_idx --heapallindexed=E2=80=9D on our secondary to= see=20 > what it says, but we expect that to take a long time to complete. > > > Thankfully, we have database backups so hopefullywe should be able to=20 > restore the data. However, any thoughts on how this happened or where=20 > to look next would be greatly appreciated. Thoughts on how to check=20 > our other indexes for corruption would also be very welcome. > > > Thanks, > > Erik > > > > Further details of our setup: > > * 2 servers in physical replication (one primary, one secondary as a > hot standby) > o both servers display the corruption > * ECC RAM > * 8 NVME SSD, raid10 (mdraid), LVM, ext4 filesystem. > o smartctl and mdadm report healthy disks > * Debian, postgres installed via apt. > * Postgres version: PostgreSQL 14.11 (Debian 14.11-1.pgdg120+1) on > x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bi= t > * Kernel version 6.1.0-22-amd64, GLIBC 2.36-9+deb12u10 > > > > Copyright =C2=A9 2025 Element - All rights reserved. The Element name, lo= go=20 > and device are registered trademarks of New Vector Ltd. Registered=20 > number: 10873661. Registered in England and Wales. Registered address:=20 > 10 Queen Street Place, London, United Kingdom, EC4R 1AG. > > This message is intended for the addressee only and may contain=20 > private and confidential information or material which may be=20 > privileged. If this message has come to you in error please delete it=20 > immediately and do not copy it or show it to any other person. > --=20 Copyright =C2=A9 2025 Element - All rights reserved. The Element name, logo= =20 and device are registered trademarks of New Vector Ltd. Registered number:= =20 10873661. Registered in England and Wales. Registered address: 10 Queen=20 Street Place, London, United Kingdom, EC4R 1AG. This message is intended=20 for the addressee only and may contain private and confidential information= =20 or material which may be privileged. If this message has come to you in=20 error please delete it immediately and do not copy it or show it to any=20 other person. --------------866EKVpxmTft7dDNXQA0xt9t Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Hi, a quick update:

- We have discovered that the corruption was present from before libicu update.
- We ran `pg_amcheck --index state_groups_state_type_idx --heapallindexed matrix`, which returned nothing
- We believe that means that (and matches what we see sampling) the index has gained extra entries, i.e. that for a given state group it does return all the relevant rows in the table plus extra rows.

We are also seeing old state groups starting to point at rows that have only just been inserted. For example, querying for 353864583 on the primary it returns that row plus four rows that have been inserted today, but on the backup from last week an index only scan for 353864583 only returns one row. This makes it feel like the corruption is ongoing? Nothing should have modified that state group in the interim (they are generally immutable).

This naively feels like when inserting a new row we sometimes add the row to the index twice: once pointing from the correct state group to the new row, and once from an old state group to the new row?


Thanks,
Erik

On 03/07/2025 18:07, Erik Johnston wrote:

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Hello,


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">We're looking into a problem with our application and have tracked it down to index corruption, whereby we have many index rows pointing to the wrong tuples in the heap.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Our table looks like:


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">```<= /p>

<= font face=3D"monospace">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0Table "matrix.state_groups_state"
=C2=A0 =C2=A0Column =C2=A0 =C2=A0| =C2=A0Type =C2=A0| Colla= tion | Nullable | Default
-------------+--------+-----------+----------+---------
=C2=A0state_group | bigint | =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0|
=C2=A0room_id =C2=A0 =C2=A0 | text =C2=A0 | =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0|
=C2=A0type =C2=A0 =C2=A0 =C2=A0 =C2=A0| text =C2=A0 | =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0|
=C2=A0state_key =C2=A0 | text =C2=A0 | =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0|
=C2=A0event_id =C2=A0 =C2=A0| text =C2=A0 | =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0|
Indexes:
=C2=A0 =C2=A0 "state_groups_state_room_id_idx" brin (room_i= d) WITH (pages_per_range=3D'1')
=C2=A0 =C2=A0 "state_groups_state_type_idx" btree (state_gr= oup, type, state_key), tablespace "postgres_second"
Triggers:
=C2=A0 =C2=A0 check_state_groups_state_deletion_trigger AFT= ER DELETE ON state_groups_state DEFERRABLE INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION check_state_groups_state_deletion()

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">```<= /p>

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">The symptoms we are noticing are that a DELETE or SELECT query includes rows that don't match the condition, as long as we issue a query that results in an Index Scan (not Index Only Scan):


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">For example, including `ctid` in the query is enough to make the planner use an Index Scan:


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">```<= /p>

<= font face=3D"monospace">SELECT ctid, state_group FROM state_groups_state WHERE state_group =3D 483128098;

=C2=A0 =C2=A0 =C2=A0 ctid =C2=A0 =C2=A0 =C2=A0| state_group=
----------------+-------------
=C2=A0(16669607,1) =C2=A0 | =C2=A0 483128098
=C2=A0(424940858,20) | =C2=A0 963361875
=C2=A0(16669606,53) =C2=A0| =C2=A0 483128098
(3 rows)

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">```<= /p>

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">But with an Index Only Scan:


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">```<= /p>

<= font face=3D"monospace">SELECT state_group FROM state_groups_state WHERE state_group =3D 483128098;
=C2=A0state_group
-------------
=C2=A0 =C2=A0483128098
=C2=A0 =C2=A0483128098
=C2=A0 =C2=A0483128098
(3 rows)

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">```<= /p>

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Since including `ctid` in the SELECT columns causes the query to use an Index Scan (fetching tuples from the heap), this inconsistency leads us to believe that our index and heap disagree.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Forcing a sequential scan with that same query only returns two rows matching that state group, which suggests that the index thinks there are more rows in the table than there actually are. (We do not believe anything can have deleted a row with state group 483128098). Also interestingly, querying (with the index re-enabled) for 963361875 returns the same row as returned above, so the row is in the index twice.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Another example state group (147961623) should only have a single row associated with it, and yet the index returns nearly 7000 rows (including the one we expect). The unexpected state groups are all in the range 794390760=E2=80=93794393085 (except one in 794411694), and = also have ctids in range (93454823,48) =E2=80=93 (93455621,49). = The fact that these are reasonably tight ranges feels suspicious. Note that the state group is a simple incrementing ID here.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">This table is quite large (about 6 TB) but we have sampled a few small ranges of it and found many instances of this type of corruption, in the first (approximate) half of the key range (0..561M out of 0..1034M).


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">For historical reasons, the table and the index are on different tablespaces, but the same filesystem.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">We have sampled the table on our secondary server, and we see the same sort of corruption going on (though given the size of the data we don=E2=80=99t actually know if it's exactly the same).


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">One coincidence is that we started seeing the first symptoms of this around the same time as libicu was updated with a security patch. However, postgres hasn=E2=80=99t been restarted and doesn=E2=80=99t reference the new version in = its process maps. Plus state groups are integers anyway. We also use the C locale, not ICU.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">We=E2=80=99= re currently running =E2=80=9Cpg_amcheck --index state_groups_state_type_idx --heapallindexed=E2=80=9D on ou= r secondary to see what it says, but we expect that to take a long time to complete.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Thankfully, we have database backups so hopefully we should be able to restore the data. However, any thoughts on how this happened or where to look next would be greatly appreciated. Thoughts on how to check our other indexes for corruption would also be very welcome.


<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Thanks,

= Erik=

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">

<= span style=3D"font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);backg= round-color:transparent;font-variant-numeric:normal;font-variant-east-asian= :normal;font-variant-alternates:normal;vertical-align:baseline">Further details of our setup:

  • 2 servers in physical replication (one primary, one secondary as a hot standby)
    • both servers display the corruption
  • ECC RAM
  • 8 NVME SSD, raid10 (mdraid), LVM, ext4 filesystem.
    • smartctl and mdadm report healthy disks
  • Debian, postgres installed via apt.
  • Postgres version: PostgreSQL 14.11 (Debian 14.11-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
  • Kernel version 6.1.0-22-amd64, GLIBC 2.36-9+deb12u10


Copyright =C2=A9 2025 Element - All rights reserved. The Element name= , logo and device are registered trademarks of New Vector Ltd. Registered number: 10873661. Registered in England and Wales. Registered address: 10 Queen Street Place, London, United Kingdom, EC4R 1AG.

This message is intended for the addressee only and may contain private and confidential information or material which may be privileged. If this message has come to you in error please delete it immediately and do not copy it or show it to any other person.=



Copyright =C2=A9 2025 Element - All rights reserved. The El= ement name, logo and device are registered trademarks of New Vector Ltd. Re= gistered number: 10873661. Registered in England and Wales. Registered addr= ess: 10 Queen Street Place, London, United Kingdom, EC4R 1AG.=

This message is intended for the addresse= e only and may contain private and confidential information or material whi= ch may be privileged. If this message has come to you in error please delet= e it immediately and do not copy it or show it to any other person.<= /span>

--------------866EKVpxmTft7dDNXQA0xt9t--