Re: query hangs out - Антон Глушаков

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Антон Глушаков <[email protected]>
To: ikramuddin <[email protected]>
To: [email protected]
Subject: Re: query hangs out
Date: Wed, 21 May 2025 12:06:54 +0300
Message-ID: <CAHnOmacs_JB_wjA9g=PZyuH5jtbFb_qkf-dS7h7H4bkS4nTQeg@mail.gmail.com> (raw)
In-Reply-To: <CAL9MbytuyCsNoKK4Aus8zsXYiBz+G+YAm+Gvoa1Pz+SvGO7fBA@mail.gmail.com>
References: <CAHnOmadn1UB-t-=Umd_TSEZ=kw48=ecX3EnesABxbPdboB-ZUQ@mail.gmail.com>
	<[email protected]>
	<CAHnOmafAW_Dqc8NEkmi=HOOMp3xf1DZdtOXauU2c1N5hq9BnVw@mail.gmail.com>
	<CAL9MbytuyCsNoKK4Aus8zsXYiBz+G+YAm+Gvoa1Pz+SvGO7fBA@mail.gmail.com>

The problem is with only one table.

As a result, I determined that the problem is on page 5 of the table (I
made SELECT ctid selections until it hangs).
Then I tried to delete rows by ctid (5, 0-100) from the table until I found
the problematic row.

Content through pageinspect:
# SELECT * FROM heap_page_items(get_raw_page('"InboxState"', 5)) where lp =
51;
-[ RECORD 1
]-------------------------------------------------------------------------------------------------------------------------------------------
lp          | 51
lp_off      | 3760
lp_flags    | 1
lp_len      | 100
t_xmin      | 136269917
t_xmax      | 66664135
t_field3    | 0
t_ctid      | (47,13)
t_infomask2 | 8203
t_infomask  | 4929
t_hoff      | 32
t_bits      | 1111011000000000
t_oid       |
t_data      |
\x3e8a7c00000000000100000090877a16b4b308dd9460898784c4af2dab692693d29bdf78bcf5153401000000fd55f20f44ec08dd9460f88969b943ab3cd8020000000000


I couldn't delete it in the standard way (delete from "InboxState" where
ctid = '(5,51)') - it also hangs.

But I can freeze it through pg_surgery.

# select heap_force_freeze('"InboxState"'::regclass, ARRAY['(5,
51)']::tid[]);

Output after freeze:
digitalarchive=# SELECT * FROM heap_page_items(get_raw_page('"InboxState"',
5)) where lp = 51;
-[ RECORD 1
]-------------------------------------------------------------------------------------------------------------------------------------------
lp          | 51
lp_off      | 3760
lp_flags    | 1
lp_len      | 100
t_xmin      | 2
t_xmax      | 0
t_field3    | 0
t_ctid      | (5,51)
t_infomask2 | 11
t_infomask  | 2817
t_hoff      | 32
t_bits      | 1111011000000000
t_oid       |
t_data      |
\x3e8a7c00000000000100000090877a16b4b308dd9460898784c4af2dab692693d29bdf78bcf5153401000000fd55f20f44ec08dd9460f88969b943ab3cd8020000000000


After that, queries to the table started to work normally.
I'll note that there are absolutely no errors in the postgres logs,
checksums are enabled, there are no errors for them either.

It seems that this is a bug.

ср, 21 мая 2025 г. в 02:52, ikramuddin <[email protected]>:

> Is it taking too long only for this table or other tables also? If the
> issue is with this single table then check when it started to happened ,
> mean after creating one index or whatever change you perform just get back
> to that point and now the query should run fine
>
>
>
>
> On Tue, 20 May 2025 at 9:14 PM, Антон Глушаков <[email protected]>
> wrote:
>
>> Thanks for the advice.
>> I tried to remove all indexes and constraints from the table - it did not
>> help.
>> I have a copy of the data (before truncate) - I can test any hypothesis
>>
>> вт, 20 мая 2025 г. в 18:25, Laurenz Albe <[email protected]>:
>>
>>> On Tue, 2025-05-20 at 16:48 +0300, Антон Глушаков wrote:
>>> > I encountered a very strange behavior.
>>> > For any query (even a simple count(*) to one specific table (a small
>>> 30MB table with 3 indexes,
>>> > without any specific data types - everything is standard out of the
>>> box vanilla Postgres) -
>>> > the query hangs dead. Waited more than 24 hours - the query did not
>>> complete).
>>> >
>>> >
>>> > Similarly, the vacuum process to the table hangs.
>>> > Only Kill -9 with a full restart helps
>>> >
>>> > I get a backtrace, from it - I then examined the pg_multixact
>>> directory, which at the time of
>>> > the problem had swelled to 900MB and had several thousand files.
>>> > I excluded long and inactive transactions, as well as prepared
>>> statements.
>>> >
>>> > The workaround in the end was this - truncate the table (it was
>>> successful), then vacuum freeze
>>> > each DB, and after that the files from pg_multixact disappeared.
>>> >
>>> > What could it be? vacuum\freeze\mulitxact  settings are default.
>>> > At the same time, the value pg_database.datminmxid=1
>>> > Could the problem with the hang be related to the many old files in
>>> pg_multixact ? (judging by the backtrace - yes)
>>>
>>> I can't say for certain, but I have seen cases like that where index
>>> corruption sent
>>> processes into an endless loop.  Next time you could try to rebuild the
>>> indexes.
>>>
>>> Yours,
>>> Laurenz Albe
>>>
>>

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: query hangs out
  In-Reply-To: <CAHnOmacs_JB_wjA9g=PZyuH5jtbFb_qkf-dS7h7H4bkS4nTQeg@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox