Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uHfR0-0074wo-AU for pgsql-admin@arkaria.postgresql.org; Wed, 21 May 2025 09:08:22 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uHfQ0-004uLz-7m for pgsql-admin@arkaria.postgresql.org; Wed, 21 May 2025 09:07:20 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uHfPz-004uLr-Nl for pgsql-admin@lists.postgresql.org; Wed, 21 May 2025 09:07:19 +0000 Received: from mail-yb1-xb32.google.com ([2607:f8b0:4864:20::b32]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uHfPx-0006Zc-07 for pgsql-admin@lists.postgresql.org; Wed, 21 May 2025 09:07:19 +0000 Received: by mail-yb1-xb32.google.com with SMTP id 3f1490d57ef6-e7b811be778so4246816276.2 for ; Wed, 21 May 2025 02:07:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747818435; x=1748423235; darn=lists.postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=DeNYPdomACHa8wDhiFaUfgnQyYUfpnrLdwpGjvcnAOM=; b=K3OVPYgM7vAE7ywqY7QnvQNu3AL4+NgVLhw3CQ1ru8l1Il7wFdHHYCeik2+Ys51UMx HRcg7gt2WVMRAAQsMrpDDzKCn1VoyKW19z9GNWry8oyVLAyA1qBv3hUXQbTQsU/F3sEu mJF8uk4UJlWK8mvvY7r7VIMSRV/0Vrry8WP38jUznjxDDcntlDg+qZeR606ZPYb9pmEj WSmQ/M5u2dLQ+LKxdMsDIHuKkcmPZArZLgnkVK+51Ym1MBBETQ3zGQT2O/Lur93wgmO1 QjqY9wYiMfxnNZsYkRNlCgO4aHaQ+oIjuR+2P6/PoxEHMBIae2xeGEquvedrJM3OKVVV LA6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747818435; x=1748423235; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=DeNYPdomACHa8wDhiFaUfgnQyYUfpnrLdwpGjvcnAOM=; b=tweDELOAHTTpW01Fa6Z3I/2NY00hdFxj9/RsGosUNSEZQTOIo/vCoohjGU0ETbwkjn iq3GWnHOQjNE0FFngSdewqedR3yhyPi9aJIF0CsSHU1rSS/mVTvrJu8+/np+BxW2hUov zXb68t8bLoCucqd+/QPuJH+paxkdgrh1Bi6SnAw7HP6lpysHNBPp0Gm88V2gFrFTlv4f DhXwGfNFc42um4z7FG0xBdBO4h87dd8nHmIA/B27fGSbIA4yTvb+XOuQPrr94WC2WLOd 5qoMCyX/P7n9w7a4iRjLDVN/mly1gr/L8Z2ovUvrsDEpJ69ekWX572Hy02QLPUjDg36Z Ht0w== X-Forwarded-Encrypted: i=1; AJvYcCVXZ0C+j/n8K3i0Cq/QNoVn+0w6gxyV9+dL8YCAj5aK1abY4+KRZK1Zyyfzt8I8a7X1kSJEt1KHHwcfoQ==@lists.postgresql.org X-Gm-Message-State: AOJu0YxSSoDlNd05zRae4U5xkNpvWvCodlZJEjdp6auTMc2nwGf/hclJ b0VLzoj3PRicpHv1qNdk2XHLRtlmg00GRQqqXGdOkzYXqhCflOoil/6NW0H2sOA9ny+jtbZl6Im RlgbGTWk2oxicygWISG4Ya4S5wIKkpsg= X-Gm-Gg: ASbGncv64pTgVap4ZxAJp3XN7Kcx773nOTycpSqSBHTXb90k6qbmPqWh1aq+RT+Pc+i m8ztl2kvCgCK2fiN19RNa86UnLW7M/Xpt3a1slJpSYF+iqeCxdtgKkjGgKdfj6+7Da8q8MefJlC jpeT5LraV0Q/S1DO/Ad1f/EHPSZP3nSKSs X-Google-Smtp-Source: AGHT+IHL1CTVZFhbQMhYacfccrDdQmC1PNaBuP6g8iMIWr7p78MLrixunhuYcKctaz1FTwiWKzLGdxVvgYNPSYFONPk= X-Received: by 2002:a05:6902:1542:b0:e7d:4c86:bde5 with SMTP id 3f1490d57ef6-e7d4c86cc6bmr7722427276.33.1747818435046; Wed, 21 May 2025 02:07:15 -0700 (PDT) MIME-Version: 1.0 References: <32ad0fda77629362dbdc90136e6d5f667d496e01.camel@cybertec.at> In-Reply-To: From: =?UTF-8?B?0JDQvdGC0L7QvSDQk9C70YPRiNCw0LrQvtCy?= Date: Wed, 21 May 2025 12:06:54 +0300 X-Gm-Features: AX0GCFt-BRFPnw2_ivvtNYeCsQgbYauHIzaWtY48YkalKfmBWqCEzLMWVj5EzJk Message-ID: Subject: Re: query hangs out To: ikramuddin , pgsql-admin@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000f3fdff0635a1b0ee" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000f3fdff0635a1b0ee Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable The problem is with only one table. As a result, I determined that the problem is on page 5 of the table (I made SELECT ctid selections until it hangs). Then I tried to delete rows by ctid (5, 0-100) from the table until I found the problematic row. Content through pageinspect: # SELECT * FROM heap_page_items(get_raw_page('"InboxState"', 5)) where lp = =3D 51; -[ RECORD 1 ]--------------------------------------------------------------------------= ----------------------------------------------------------------- lp | 51 lp_off | 3760 lp_flags | 1 lp_len | 100 t_xmin | 136269917 t_xmax | 66664135 t_field3 | 0 t_ctid | (47,13) t_infomask2 | 8203 t_infomask | 4929 t_hoff | 32 t_bits | 1111011000000000 t_oid | t_data | \x3e8a7c00000000000100000090877a16b4b308dd9460898784c4af2dab692693d29bdf78b= cf5153401000000fd55f20f44ec08dd9460f88969b943ab3cd8020000000000 I couldn't delete it in the standard way (delete from "InboxState" where ctid =3D '(5,51)') - it also hangs. But I can freeze it through pg_surgery. # select heap_force_freeze('"InboxState"'::regclass, ARRAY['(5, 51)']::tid[]); Output after freeze: digitalarchive=3D# SELECT * FROM heap_page_items(get_raw_page('"InboxState"= ', 5)) where lp =3D 51; -[ RECORD 1 ]--------------------------------------------------------------------------= ----------------------------------------------------------------- lp | 51 lp_off | 3760 lp_flags | 1 lp_len | 100 t_xmin | 2 t_xmax | 0 t_field3 | 0 t_ctid | (5,51) t_infomask2 | 11 t_infomask | 2817 t_hoff | 32 t_bits | 1111011000000000 t_oid | t_data | \x3e8a7c00000000000100000090877a16b4b308dd9460898784c4af2dab692693d29bdf78b= cf5153401000000fd55f20f44ec08dd9460f88969b943ab3cd8020000000000 After that, queries to the table started to work normally. I'll note that there are absolutely no errors in the postgres logs, checksums are enabled, there are no errors for them either. It seems that this is a bug. =D1=81=D1=80, 21 =D0=BC=D0=B0=D1=8F 2025=E2=80=AF=D0=B3. =D0=B2 02:52, ikra= muddin : > Is it taking too long only for this table or other tables also? If the > issue is with this single table then check when it started to happened , > mean after creating one index or whatever change you perform just get bac= k > to that point and now the query should run fine > > > > > On Tue, 20 May 2025 at 9:14=E2=80=AFPM, =D0=90=D0=BD=D1=82=D0=BE=D0=BD = =D0=93=D0=BB=D1=83=D1=88=D0=B0=D0=BA=D0=BE=D0=B2 > wrote: > >> Thanks for the advice. >> I tried to remove all indexes and constraints from the table - it did no= t >> help. >> I have a copy of the data (before truncate) - I can test any hypothesis >> >> =D0=B2=D1=82, 20 =D0=BC=D0=B0=D1=8F 2025=E2=80=AF=D0=B3. =D0=B2 18:25, L= aurenz Albe : >> >>> On Tue, 2025-05-20 at 16:48 +0300, =D0=90=D0=BD=D1=82=D0=BE=D0=BD =D0= =93=D0=BB=D1=83=D1=88=D0=B0=D0=BA=D0=BE=D0=B2 wrote: >>> > I encountered a very strange behavior. >>> > For any query (even a simple count(*) to one specific table (a small >>> 30MB table with 3 indexes, >>> > without any specific data types - everything is standard out of the >>> box vanilla Postgres) - >>> > the query hangs dead. Waited more than 24 hours - the query did not >>> complete). >>> > >>> > >>> > Similarly, the vacuum process to the table hangs. >>> > Only Kill -9 with a full restart helps >>> > >>> > I get a backtrace, from it - I then examined the pg_multixact >>> directory, which at the time of >>> > the problem had swelled to 900MB and had several thousand files. >>> > I excluded long and inactive transactions, as well as prepared >>> statements. >>> > >>> > The workaround in the end was this - truncate the table (it was >>> successful), then vacuum freeze >>> > each DB, and after that the files from pg_multixact disappeared. >>> > >>> > What could it be? vacuum\freeze\mulitxact settings are default. >>> > At the same time, the value pg_database.datminmxid=3D1 >>> > Could the problem with the hang be related to the many old files in >>> pg_multixact ? (judging by the backtrace - yes) >>> >>> I can't say for certain, but I have seen cases like that where index >>> corruption sent >>> processes into an endless loop. Next time you could try to rebuild the >>> indexes. >>> >>> Yours, >>> Laurenz Albe >>> >> --000000000000f3fdff0635a1b0ee Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
The problem is with only one table.

As a result, I = determined that the problem is on page 5 of the table (I made SELECT ctid s= elections until it hangs).
Then I tried to delete rows by ctid (5, 0-100= ) from the table until I found the problematic row.

Content through = pageinspect:
# SELECT * FROM heap_page_items(get_raw_page('"Inb= oxState"', 5)) where lp =3D 51;
-[ RECORD 1 ]------------------= ---------------------------------------------------------------------------= ----------------------------------------------
lp =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0| 51
lp_off =C2=A0 =C2=A0 =C2=A0| 3760
lp_flags =C2=A0 = =C2=A0| 1
lp_len =C2=A0 =C2=A0 =C2=A0| 100
t_xmin =C2=A0 =C2=A0 =C2= =A0| 136269917
t_xmax =C2=A0 =C2=A0 =C2=A0| 66664135
t_field3 =C2=A0 = =C2=A0| 0
t_ctid =C2=A0 =C2=A0 =C2=A0| (47,13)
t_infomask2 | 8203
= t_infomask =C2=A0| 4929
t_hoff =C2=A0 =C2=A0 =C2=A0| 32
t_bits =C2=A0= =C2=A0 =C2=A0| 1111011000000000
t_oid =C2=A0 =C2=A0 =C2=A0 |
t_data = =C2=A0 =C2=A0 =C2=A0| \x3e8a7c00000000000100000090877a16b4b308dd9460898784c= 4af2dab692693d29bdf78bcf5153401000000fd55f20f44ec08dd9460f88969b943ab3cd802= 0000000000


I couldn't delete it in the standard way (delete = from "InboxState" where ctid =3D '(5,51)') - it also hang= s.

But I can freeze it through pg_surgery.

# select heap_forc= e_freeze('"InboxState"'::regclass, ARRAY['(5, 51)'= ;]::tid[]);

Output after freeze:
digitalarchive=3D# SELECT * FROM= heap_page_items(get_raw_page('"InboxState"', 5)) where l= p =3D 51;
-[ RECORD 1 ]-------------------------------------------------= ---------------------------------------------------------------------------= ---------------
lp =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| 51
lp_off =C2= =A0 =C2=A0 =C2=A0| 3760
lp_flags =C2=A0 =C2=A0| 1
lp_len =C2=A0 =C2= =A0 =C2=A0| 100
t_xmin =C2=A0 =C2=A0 =C2=A0| 2
t_xmax =C2=A0 =C2=A0 = =C2=A0| 0
t_field3 =C2=A0 =C2=A0| 0
t_ctid =C2=A0 =C2=A0 =C2=A0| (5,5= 1)
t_infomask2 | 11
t_infomask =C2=A0| 2817
t_hoff =C2=A0 =C2=A0 = =C2=A0| 32
t_bits =C2=A0 =C2=A0 =C2=A0| 1111011000000000
t_oid =C2=A0= =C2=A0 =C2=A0 |
t_data =C2=A0 =C2=A0 =C2=A0| \x3e8a7c000000000001000000= 90877a16b4b308dd9460898784c4af2dab692693d29bdf78bcf5153401000000fd55f20f44e= c08dd9460f88969b943ab3cd8020000000000


After that, queries to the= table started to work normally.
I'll note that there are absolutely= no errors in the postgres logs, checksums are enabled, there are no errors= for them either.

It seems that this is a bug.

=D1=81=D1=80, 21 =D0=BC=D0=B0=D1=8F 2025=E2=80=AF=D0=B3. =D0=B2 02:52, ik= ramuddin <ikram.amani815@gma= il.com>:
=
Is it taking too long only for this table or other tables= also? If the issue is with this single table then check when it started to= happened , mean after creating one index or whatever change you perform ju= st get back to that point and now the query should run fine




= On Tue, 20 May 2025 at 9:14=E2=80=AFPM, =D0=90=D0=BD=D1=82=D0=BE=D0=BD =D0= =93=D0=BB=D1=83=D1=88=D0=B0=D0=BA=D0=BE=D0=B2 <a.glushakov86@gmail.com> wrote:<= br>
Thanks for the advice.
I tried to remove all indexes and constraints fr= om the table - it did not help.
I have a copy of the data (before truncat= e) - I can test any hypothesis


=D0=B2=D1=82, 20 =D0=BC=D0=B0=D1=8F 2025=E2=80=AF=D0=B3. =D0=B2 18:25, = Laurenz Albe <laurenz.albe@cybertec.at>:
On Tue, 2025-05-20 a= t 16:48 +0300, =D0=90=D0=BD=D1=82=D0=BE=D0=BD =D0=93=D0=BB=D1=83=D1=88=D0= =B0=D0=BA=D0=BE=D0=B2 wrote:
> I encountered a very strange behavior.
> For any query (even a simple count(*) to one specific table (a small 3= 0MB table with 3 indexes,
> without any specific data types - everything is standard out of the bo= x vanilla Postgres) -
> the query hangs dead. Waited more than 24 hours - the query did not co= mplete).
>
>
> Similarly, the vacuum process to the table hangs.
> Only Kill -9 with a full restart helps
>
> I get a backtrace, from it - I then examined the pg_multixact director= y, which at the time of
> the problem had swelled to 900MB and had several thousand files.
> I excluded long and inactive transactions, as well as prepared stateme= nts.
>
> The workaround in the end was this - truncate the table (it was succes= sful), then vacuum freeze
> each DB, and after that the files from pg_multixact disappeared.
>
> What could it be? vacuum\freeze\mulitxact=C2=A0 settings are default.<= br> > At the same time, the value pg_database.datminmxid=3D1
> Could the problem with the hang be related to the many old files in pg= _multixact ? (judging by the backtrace - yes)

I can't say for certain, but I have seen cases like that where index co= rruption sent
processes into an endless loop.=C2=A0 Next time you could try to rebuild th= e indexes.

Yours,
Laurenz Albe
--000000000000f3fdff0635a1b0ee--